As currently designed and flown, spacecraft need considerable maintenance to perform
their missions. Mission readiness is jeopardized, however, because the ground support that
provides the maintenance is vulnerable to both hostile action and operator errors. To
address this, the Jet Propulsion Laboratory (JPL), California Institute of Technology, was
commissioned in March 1980 by the Air Force Office of Scientific Research to lead a
study of autonomous spacecraft maintenance (ASM). ASM is a spacecraft design that
tolerates hardware and software failures and design faults, while requiring minimum
ground contact to perform the mission. The study group, composed of experts from
industry, academia, and NASA, was to identify critical issues related to ASM technology
development and detail the infusion of this technology into future Air Force spacecraft
systems. To facilitate this, three subgroups were formed: the Spacecraft System
Technology Working Group, composed of spacecraft system specialists from various spacecraft
suppliers; the Fault-Tolerant Technology Working Group, composed of specialists in
fault-tolerant computer technology from academic and independent research institutions;
and the Academic Assessment Committee, composed of leading researchers from academic
and independent research institutions.


These groups were brought together in a series of three workshops held at JPL in May,
July, and August 1980, under the guidance of the Study Planning Committee. The
spacecraft systems and fault-tolerant working group members presented their
organizations' current capabilities in spacecraft and fault-tolerant computers, respectively, from
which a state-of-the-art technical data base was established. A set of conceptual design
requirements was then developed, detailing what an ASM spacecraft must do. Thus,
knowing on one hand the capabilities of current spacecraft and, on the other, the
requirements for ASM, the working groups began a search for the optimum plan for the
integration of ASM into spacecraft.

The major product of the Spacecraft System Technology Working Group was the
Implementation Plan, which details the group's recommended approach for incorporating
ASM capabilities into operational spacecraft by 1989. The Fault-Tolerant Technology
Working Group and the Academic Assessment Committee together established the
Research Agenda, which outlines basic research activities required to fill technological
gaps.


It is hoped that the material presented here will provide guidance for the evolution
of future spacecraft systems. The study participants believe that the interaction between
the working groups has been synergistic, and has contributed to an increased awareness
of potential technology capabilities.

This report outlines a plan to incorporate autonomous spacecraft maintenance (ASM)
capabilities into Air Force spacecraft by 1989. These capabilities include the successful
operation of the spacecraft without ground-operator intervention for extended periods of
time. Autonomous maintenance requires extensive use of onboard fault detection,
isolation, and recovery mechanisms integrated into the spacecraft within a hierarchical
architecture. These mechanisms, along with a fault-tolerant data processing system (including a
nonvolatile backup memory) and an autonomous navigation capability, are needed to
replace the routine servicing that is presently performed by the ground system.

As part of this study, the state-of-the-art fault-handling capabilities of various
spacecraft and computers are described, and a set of conceptual design requirements needed
to achieve ASM is established. From these two inputs, an implementation plan
describing near-term technology development needed for an ASM proof-of-concept
demonstration by 1985, and a research agenda addressing long-range academic research for an
advanced ASM system of the 1990s, are established.

Executive Summary


I. Introduction 


Spacecraft are presently designed to interact with the
ground-control/operations center for routine maintenance and
for fault diagnosis and reconfiguration in the event of onboard
problems. During periods of conflict, however, the control/
operations center is vulnerable to hostile action. To continue
operation during these periods, spacecraft [...]
autonomously performing predetermined ground [...]
is accomplished by autonomous spacecraft maintenance
(ASM). ASM maintains the spacecraft in a state of readiness by
providing spacecraft designs that require no ground contact/
interaction for onboard detection, isolation, and recovery
from faults or for routine operations such as power
management.

The study group was commissioned to determine a way to
incorporate ASM into spacecraft. To do this, they first made a
state-of-the-art technology assessment of current spacecraft
systems, and then determined some general requirements for
an ASM spacecraft. From this, they developed the
Implementation Plan that leads to the incorporation of ASM into
operational spacecraft by 1989. Included in this was an
identification of the needed technologies to fill immediate
gaps. To address longer-term technology issues for use in a
second-generation ASM spacecraft of the 1990s, the study
group also developed the Research Agenda.

II. State-of-the-Art Technology


The members of the working groups presented examples of
current spacecraft and fault-tolerant computer systems,
describing their fault-handling characteristics. Examples of the
spacecraft presented were: FLTSATCOM and LEASAT
(communications), Global Positioning System (navigation), Defense
Meteorological Satellite Program (meteorological),
Multimission Modular Spacecraft (multimission), and Voyager
(planetary exploration). Although each of these spacecraft
performs some functions autonomously, none is capable of
fully autonomous operation, mainly because this capability [...]

The fault-tolerant computers described were: Fault-Tolerant
Spaceborne Computer and Building Block Fault-Tolerant Computer
(spaceborne); C.mmp, Cm*, and C.vmp (commercial); Software
Implemented Fault-Tolerance and Fault-Tolerant Multiprocessor
(commercial aviation). Of the fault-tolerant computers
presented, none is operational yet, and only the Fault-Tolerant
Spaceborne Computer and Building Block Fault-Tolerant
Computer will be applicable to spacecraft systems. The others,
however, provided examples of design methodologies and
techniques that may be applicable to spaceborne computers.

Each of the spacecraft that was presented required
interaction with the ground system for normal operations
management, as well as fault diagnosis and recovery. This interaction
was needed for such things as power management,
navigation corrections, and any abnormalities that occurred.
Having the spacecraft rely on the control/operations
center makes the overall space system as vulnerable as the
control/operations center. It also creates a long “down time”
whenever a fault occurs, because the fault must be diagnosed
and reconfiguration commands must be developed by the
control/operations center. This vulnerability and down time
can be reduced by shifting the management of routine
operations and fault handling from the ground to the
spacecraft (i.e., the spacecraft performs them autonomously).

Thus, the need for ASM is recognized; furthermore, the 
study group believes that the technology available in today’s 
spacecraft systems is a good foundation from which to 
proceed to ASM. 


III. The ASM-Enhanced System

The impact of ASM on future Air Force spacecraft is based 
upon an analysis of the current system's ability to meet the 
candidate design requirements formulated by the study group. 
In summary, these ASM conceptual design requirements are: 

(1) The ASM spacecraft shall operate without ground
intervention for up to 60 days with no performance
degradation, and up to 6 months with degraded, but
acceptable, performance. (The actual periods of
autonomy may vary with different mission applications;
however, the participants felt that these were
worthwhile goals for this study.)

(2) ASM shall not reduce the spacecraft's performance or
design lifetime.

(3) The ground segment shall always be able to override 
ASM actions and interrogate the spacecraft for fault- 
management data (audit trails). 
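
As an illustration only, requirements (1) and (3) can be written down as a small check. The Python sketch below classifies a period without ground contact against the 60-day and 6-month goals, and keeps an audit trail that the ground can interrogate. The names `required_performance`, `AutonomyMonitor`, `record`, and `ground_override`, and the reading of "6 months" as 183 days, are assumptions made for this example and do not come from the study.

```python
from dataclasses import dataclass, field

FULL_PERFORMANCE_DAYS = 60    # requirement (1): no performance degradation
DEGRADED_DAYS = 183           # assumption: "6 months" taken as about 183 days


def required_performance(days_without_ground: float) -> str:
    """Return the performance level requirement (1) calls for after a given gap."""
    if days_without_ground <= FULL_PERFORMANCE_DAYS:
        return "full"
    if days_without_ground <= DEGRADED_DAYS:
        return "degraded-but-acceptable"
    return "beyond-stated-goal"


@dataclass
class AutonomyMonitor:
    """Hypothetical onboard record keeper for requirement (3)."""
    audit_trail: list = field(default_factory=list)
    overridden: bool = False

    def record(self, day: float, action: str) -> None:
        # Every autonomous action is logged so the ground can interrogate it later.
        self.audit_trail.append((day, action))

    def ground_override(self, day: float, command: str) -> None:
        # The ground segment can always take control; the override is logged too.
        self.overridden = True
        self.audit_trail.append((day, f"GROUND OVERRIDE: {command}"))


monitor = AutonomyMonitor()
monitor.record(12.0, "switched to redundant receiver")
monitor.ground_override(75.0, "restore primary receiver")
print(required_performance(45), required_performance(75), len(monitor.audit_trail))
```

Requirement (2) is not modeled here, since it constrains the design rather than any runtime behavior.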

Satisfying these design requirements implies the movement
of routine maintenance and operations from the ground
segment to the space segment. The control/operations center
will assume a supervisory role, potentially less complex, while
the space segment will become more complex. The resulting
benefits of ASM would then include: (1) reduced system
vulnerability because the spacecraft is no longer dependent
upon the control/operations center and (2) faster recovery
from failures (seconds instead of hours or possibly days).

The impact of ASM on spacecraft design is expected to be
evolutionary. Traditional subsystems are expected to be
augmented by two new subsystems: a fault-tolerant data
processing subsystem with nonvolatile backup memory, and
an autonomous navigation subsystem.

The system architecture is expected to possess a “layered”
fault-protection scheme, enabling fault containment at the
lowest possible level to minimize subsystem interdependencies.
In this scheme, individual subsystems, under system control,
will be required to diagnose local failures and take corrective
action. The system will be required to diagnose and correct
ambiguous failures within the subsystem interfaces and ASM
mechanisms themselves, as well as to judiciously allocate the
system resources.
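
The layered scheme described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the architecture the study group specified: the class names `Subsystem` and `SystemExecutive`, the symptom strings, and the corrective actions are all hypothetical. The point it shows is fault containment at the lowest level, with escalation to the system level only when a subsystem cannot diagnose the fault itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Fault:
    source: str    # subsystem that reported the symptom
    symptom: str   # e.g. "undervoltage"


class Subsystem:
    """Lowest layer: diagnoses local failures and takes corrective action."""

    def __init__(self, name: str, local_fixes: Dict[str, str]):
        self.name = name
        self.local_fixes = local_fixes   # symptom -> corrective action

    def handle(self, fault: Fault) -> Optional[str]:
        # Returns the corrective action, or None if the fault is ambiguous here.
        return self.local_fixes.get(fault.symptom)


class SystemExecutive:
    """Upper layer: resolves faults that the subsystems cannot contain."""

    def __init__(self, subsystems: List[Subsystem]):
        self.subsystems = {s.name: s for s in subsystems}
        self.log = []   # audit trail of (source, symptom, level, action)

    def handle(self, fault: Fault) -> str:
        subsystem = self.subsystems.get(fault.source)
        action = subsystem.handle(fault) if subsystem else None
        level = "subsystem"
        if action is None:
            # Escalate: the fault could not be contained at the lowest level.
            action = "system-level diagnosis and resource reallocation"
            level = "system"
        self.log.append((fault.source, fault.symptom, level, action))
        return level


power = Subsystem("power", {"undervoltage": "shed noncritical loads"})
executive = SystemExecutive([power])
print(executive.handle(Fault("power", "undervoltage")))
print(executive.handle(Fault("power", "conflicting telemetry")))
```

The first fault is handled inside the power subsystem; the second is unknown to it and is passed up to the system layer, which also records both events for later interrogation by the ground.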


IV. Implementation Plan

The Implementation Plan focuses on near-term industrial 
technology development and, most importantly, the earliest 
possible system-level proof-of-concept demonstration (1985) 
to support a 1989 launch. The plan stresses delivery of 
“product” in a steady stream from subsystems to a complete 
system for the System Program Offices’ consideration and 
introduction into flight programs. 

As shown in Fig. 1, the Implementation Plan consists of
four major tasks. These are: (1) redesign existing subsystems
to characterize and demonstrate ASM capabilities;
(2) design, develop, and test an ASM system demonstration
breadboard to show that ASM is a viable concept; (3) perform
applications research required to develop an autonomous
navigation capability and a fault-tolerant data processing
capability to fill existing technology gaps; and (4) perform basic
research needed to develop a second-generation ASM system
for the 1990s. A section of this report, “Research Agenda,”
elaborates upon Task 4.

A budgetary resource estimate for the proposed ASM 
program is $36.4M (FY80 dollars) over five years. For several
reasons, this figure should be considered only an estimate. 
First, the cost of developing the new technology is not well 
known. Second, a specific mission application has not been 
assumed, and so candidate spacecraft could not be assumed. 
Finally, substantiating data was not provided by the industrial 
participants. For these reasons, a more definitive cost study 
should be performed in the initial phase of the activity. 

V. Research Agenda 

The Research Agenda proposes basic research that is a
synergistic part of the ASM program. Future ASM development
activities are focused on five areas: (1) very-large-scale
integration (VLSI) technology, which includes self-testing
VLSI and on-chip redundancy; (2) system architecture, which
addresses spacecraft organizational issues, architectural
developments, and advanced system studies; (3) software fault
tolerance, consisting of system partitioning and interface
definition, self-checking flight software, and fault-tolerant
software; (4) modeling and analysis, comprising experimental
testing, statistical modeling, and functional description,
modeling, and verification; and (5) supporting developments
needed to formulate an ASM data base and to build an
ASM spacecraft laboratory.

VI. Conclusions and Recommendations

The following conclusions and recommendation are those
of the study group participants, resulting from analysis of the
material developed during the workshops.

A. Conclusions

(1) ASM, fully implemented, would reduce space system
vulnerability by eliminating spacecraft dependence on
the control/operations center for up to 6 months at a
time.

(2) The ASM capability need not impose operational
constraints on the system user; it must be “transparent”
to the user during normal system operations.

(3) ASM would require a change in the conduct of opera- 
tions and control, from dependence on a man in the 
loop to dependence on machines for fault handling 
and routine maintenance operations. 

(4) ASM would increase the spacecraft complexity;
therefore, new methods for specifying, testing, and
validating ASM-enhanced spacecraft are needed.

(5) A more effective means of transferring technology 
from research to applications programs would be 
required so that spacecraft problems could be solved 
with the latest available technology. 

(6) New technology developments would be required:
needed are a highly reliable fault-tolerant data
processing system with nonvolatile backup memory to
enable autonomous maintenance, and an autonomous
navigation capability to enable independence from
routine ground operations.

(7) ASM would be a phased program; spacecraft would 
not instantly become totally autonomous. The pace 
of ASM development would depend on the resources, 
technology, and chosen program applications that are 
available. A strong corporate commitment to ASM by 
the Air Force, along with a willingness by industry to 
assimilate ASM, would be required to make ASM 
successful. 

(8) Confidence in ASM must be instilled by creation of 
a systematic modeling, analysis, and demonstration 
program. 

(9) Although considerable technology developments are
necessary, no requirements for technology
breakthroughs have been identified.

(10) In the opinion of the study group, ASM is a viable 
concept. 

B. Recommendation 

The study group recommends that the ASM research and
technology development activities, as outlined in the
Implementation Plan and Research Agenda sections of this report,
be initiated as soon as possible. This would enable the earliest
possible spacecraft system-level proof-of-concept
demonstration of ASM.

Fig. 1. Implementation plan outline

Introduction


Currently, when certain critical failure states are detected,
spacecraft usually enter a “safe-hold” mode: mission
operations are suspended and ground intervention in the form
of reconfiguration commands is required to restore normal
operations. Spacecraft are also not presently designed to
autonomously recover from design faults, software failures, or
changing environmental conditions.

Autonomous spacecraft maintenance is characterized by:

(1) Spacecraft design that tolerates hardware and software
failures and design faults.

(2) Spacecraft design that requires, for extended periods
of time, virtually no ground contact/interaction for
onboard detection, isolation, and recovery of faults, or
routine maintenance functions.

For the most part, such capabilities have been beyond the
state of the art of spacecraft systems. This study group has
been commissioned by the Air Force to address [...]
and to detail a plan leading to the incorporation of ASM in
operational spacecraft by 1989.


I. Current Space Systems


The space system is composed of the space segment and
the ground segment, as illustrated in Fig. 2. For this study, the
space segment consists of only the spacecraft, while the
ground segment consists of three separate entities: data
processing stations, communications centers, and the control/
operations center. The data processing stations and the
communications centers may be numerous and are payload-data
users only, whereas the control/operations center is
nonredundant and is responsible for the overall management of the
spacecraft.

II. Definition of the Problem 

It can be seen that the loss of any single data processing
station or communications center will not jeopardize the space
segment; on the other hand, the loss of the control/operations
center may eventually render the entire space system
ineffective. The dependence of the spacecraft on the control/operations
center and the vulnerability of the center to hostile
action and operator error are the concerns of this study.

III. Scope of the Study

Spacecraft autonomy involves several elements: autonomous
spacecraft maintenance, autonomous mission sequencing
and control, autonomous navigation, and autonomous payload
data processing. To have a completely autonomous spacecraft,
all of these elements would have to be included. This study,
however, was to address only the spacecraft maintenance element.

State-of-the-Art Technology


The members of the Spacecraft System Technology and
Fault-Tolerant Technology Working Groups presented
descriptions of the fault-handling characteristics of existing spacecraft
and fault-tolerant computer systems. These were
representative of successful systems designed to operate within specific
environments: the spacecraft systems for the space
environment and the computer systems (to date) within laboratory
environments. In this section, a summary of the current
capabilities of the spacecraft and fault-tolerant computer
systems is given, followed by an assessment of their relevance
to ASM.


I. Spacecraft 

Air Force satellites can be categorized into four mission
classes: communications, navigation, meteorological, and
surveillance. During the workshops, satellites from each of the
classes except surveillance were presented. Because surveillance
satellites were classified, no detailed information was solicited.
Additionally, presentations were given describing some of the
planetary exploration spacecraft.

The members of the Spacecraft System Technology Working
Group were chosen because of their expertise in the
subject field and because of their affiliated organization's
experience as a supplier of Air Force spacecraft for one or
more of the mission classes. Each member described examples
of current spacecraft, explaining the design methodology
relating to fault handling. Numerous systems and subsystems
were described by the participants; the systems presented in
this section are examples only, representing the different
mission classes. It is not being suggested that these spacecraft
are leading candidates for their mission application. Such an
assessment was not a part of the study. The fault-handling
characteristics of the following spacecraft will be described.


Mission class            Example spacecraft

Communications           FLTSATCOM, LEASAT
Navigation               Global Positioning System
Meteorological           Defense Meteorological Satellite Program
Multimission             Multimission Modular Spacecraft
Planetary exploration    Voyager

A. Example Spacecraft

1. Communications spacecraft. FLTSATCOM is a 3-axis
stabilized, 23-channel communications satellite. It flies in a
geosynchronous orbit and has a 5-year design life. Four of
these spacecraft are operational; the first was launched on
February 9, 1978. Its fault-handling characteristics are given in
Table 1.

Table 1. FLTSATCOM fault-handling characteristics

Fault-tolerant attribute    Description

Onboard hardware            Reliability achieved by redundant components
                            Cross-strapping
                            Onboard switching to “safe-hold” mode
                            Undervoltage detection resulting in automatic
                              load shed
                            Battery cell monitor and switching
                            Command receiver toggle

Onboard software

Ground assisted             Allow ground intervention for failure analysis
                              and switching
                            Redundancy management on ground
                            Ground override capability

LEASAT is a spin-stabilized geosynchronous communications
satellite designated as a functional follow-on to
FLTSATCOM, with a design life of 10 years. Four satellites
will be shuttle-launched beginning in 1984. These satellites'
fault-handling characteristics are given in Table 2.

Table 2. LEASAT fault-handling characteristics

Fault-tolerant attribute    Description

Onboard hardware            Automatic transfer to rate hold mode in event of
                              loss of sensor
                            Automatically activates redundant control
                              electronics/motor driver and motor in event
                              of loss of despin control
                            No single-point failures in thruster operations
                            Automatic fault detection and ground alerting
                            Redundant elements to unit level
                            Receiver time out
                            Watchdog timers

Onboard software            None

Ground assisted             Allow ground intervention for failure analysis
                              and switching
                            Redundant switching for battery charge rates and
                              battery reconditioning
                            Redundancy management on ground

2. Navigation spacecraft. The Global Positioning System
(GPS) satellite is a 3-axis stabilized, semisynchronous (12-hour
orbit) navigation satellite. It will enable a user to accurately
determine his position, velocity, and time. When fully
operational, there will be 18 satellites on orbit. To date six have
been launched, the first on February 22, 1977. Each satellite is
designed for a mean mission duration of 5 years; the
fault-handling characteristics are given in Table 3.

Table 3. GPS fault-handling characteristics

Fault-tolerant attribute    Description

Onboard hardware            Full redundancy except where impractical
                              (e.g., structure)
                            Multiple redundancy in critical subassemblies
                              (e.g., triply redundant atomic clocks)
                            Automatic detection and isolation
                              Electrical shorts, attitude loss handled by
                                load shedding
                              Jet runaway handled by watchdog logic
                            Automatic detection and correction at unit level
                              for system performance degradation failures
                              Earth sensor
                              Control Electronics Assembly power supplies
                            Masking of solar array system performance
                              degradation failures
                            Automatic Sun reacquisition from eclipse

Onboard software            No spacecraft bus software

Ground assisted             Allow ground intervention for failure analysis and
                              switching
                            Redundancy management on ground
                            Battery reconditioning
                            Routine health and status monitoring
                            Ephemeris and time update
                            Magnetic momentum dump


3. Meteorological spacecraft. Defense Meteorological Satellite
Program (DMSP) Block 5D spacecraft are 3-axis stabilized
and operate in a Sun-synchronous polar orbit at 830 km
(450 nmi). The Block 5D spacecraft have a 2-year design life;
the first was launched September 11, 1976. The fault-handling
characteristics of these satellites are given in Table 4.

Table 4. DMSP fault-handling characteristics

Fault-tolerant attribute    Description

Onboard hardware            Hardware watchdog timer requires periodic
                              response, protects against loss of power and
                              clock or system lock-up
                            Hardware testing of parity, illegal instructions,
                              and memory addresses (computer self-tests)
                            Physical and functional redundancy in subsystems
                            Hardware detection/switching in power subsystem
                            Redundant central processing units

Onboard software            Protective software in power subsystem
                              Solar array drive control
                              Battery state of charge, low voltage,
                                temperature checks
                              Load shedding in event of fault
                              Software response to errors tested in above
                            Spare memory with special software packages
                              to anticipate recovery after memory failures
                            Detection/switching in subsystems other than
                              power

Ground assisted             Allow ground intervention for failure analysis
                              and switching
                            Power regulator failure detection and
                              corrective action
                            Redundancy management on ground

4. Multimission spacecraft. The Multimission Modular
Spacecraft is 3-axis stabilized and can be used for various
mission classes. It can be used in orbital altitudes from
low-Earth to geosynchronous. The first launch was February 14,
1980. It has a 2-year mission lifetime, and is capable of being
resupplied by the shuttle. Its fault-handling characteristics
are given in Table 5.

Table 5. Multimission Modular Spacecraft fault-handling characteristics

Fault-tolerant attribute    Description

Onboard hardware            Block redundancy
                            No credible single-point failures in
                              spacecraft bus
                            Computer failure detections (watchdog
                              timers) in the attitude control,
                              communications, and power modules;
                              reconfigures spacecraft to power safe
                              Sun-pointing mode using analog backup system
                            Undervoltage detection and safing

Onboard software            Battery state-of-charge calculations
                            Computer self-test
                            Spacecraft off-pointing detection and safing
                            Telemetry data quality checks
                            Internal validity checks for attitude
                              determination and control software
                            Monitor health and safety of predetermined
                              payload instruments with onboard
                              safing actions

Ground assisted             Allow ground intervention for failure analysis
                              and switching
                            Special onboard processor test, memory
                              patterns, diagnostic instruction test
                            Reprogram computers
                            Data trend analysis

5. Planetary exploration spacecraft. The two 3-axis stabilized
Voyager spacecraft, launched August 20 and September 5,
1977, are designed to explore the planets Jupiter and Saturn.
The design was for a 4-year mission; each has enough
expendables for a possible extended mission. Table 6 lists
Voyager's fault-handling characteristics.

Table 6. Voyager fault-handling characteristics

Fault-tolerant attribute    Description

Onboard hardware            Overtemperature protection for payload
                              instruments
                            No significant single-point failures
                            Block and functional redundancy
                            Computer self-test
                            Nonvolatile memory for Computer Command
                              Subsystem and Attitude and Articulation
                              Control Subsystem
                            Undervoltage protection

Onboard software            Parity and code checks
                            Restores command link
                            Switches processors
                            Switches power elements
                            Switches Sun/star sensors
                            Switches thrusters/plumbing
                            Reprogrammable
                            Event timing and event counting
                            Retry for some data transmission errors
                            Block parity validation of command
                              sequences

Ground assisted             [...] redundant components with
                              noncatastrophic failure modes
                            Alternate operating modes
                            Trend analysis and calibrations

B. Fault-Handling Design Features of Spacecraft

Several observations can be made from the fault-handling
characteristics of the spacecraft presented. They
typically employ block and functional redundancy for high
reliability, as well as watchdog timers, cross-strapping, and
switching networks for fault protection and self-preservation.
In general, there are no credible single-point failures. The
ground-assisted features include such capabilities as ephemeris
and time updates, trend analysis, and mission reconfiguration.
Redundancy management is done mostly on the ground, and
in all cases the ground has an override capability.

Block redundancy employs complete, identical, extra
components that can take over in the event of a component
failure. There are several levels at which block redundancy can
be applied. In ascending order these are the element,
functional unit, functional string, subsystem, and system levels.

Functional redundancy, on the other hand, does not 
employ identical components, but instead performs nearly the 
same functions using alternate system or subsystem configura- 
tions, typically controlled from the ground. Functional 
redundancy has an advantage over block redundancy in that it 
helps avoid systematic design errors; however, it generally 
does not possess equal performance capability. 
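
The trade-off between the two kinds of redundancy can be made concrete with a short Python sketch. Everything in it is hypothetical and chosen only for illustration: the `Unit` record, the `select_unit` helper, the gyro and star-sensor example, and the relative performance figure of 0.7. It shows that an identical spare restores full performance after a random failure, while only a differently designed alternate survives an error common to the design.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Unit:
    name: str
    design: str          # units sharing a design share its design faults
    performance: float   # relative capability, 1.0 = nominal
    healthy: bool = True


def select_unit(candidates: List[Unit], faulty_designs: Set[str]) -> Optional[Unit]:
    """Pick the best-performing unit that is healthy and free of known design faults."""
    usable = [u for u in candidates
              if u.healthy and u.design not in faulty_designs]
    return max(usable, key=lambda u: u.performance, default=None)


primary = Unit("gyro A", design="gyro", performance=1.0)
block_spare = Unit("gyro B", design="gyro", performance=1.0)   # identical copy
functional_alt = Unit("star-sensor rate estimate", design="optical", performance=0.7)
units = [primary, block_spare, functional_alt]

# Random hardware failure of the primary: the block spare restores full performance.
primary.healthy = False
print(select_unit(units, faulty_designs=set()).name)

# Systematic design error affecting every gyro: only the functional alternate survives.
print(select_unit(units, faulty_designs={"gyro"}).name)
```

In the second case the selected unit works but at reduced capability, which is the performance penalty the text attributes to functional redundancy.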

In the event of a failure, the operating philosophy for the 
spacecraft systems has been to rely on ground interaction to 
restore successful operations. This has given rise to the safe- 
hold mode in which the spacecraft is autonomously switched 
to a benign state until ground interactions restore operation. 
The ground action typically involves fault detection through 
analysis, commanded switching to isolate the defective ele- 
ment, and finally recovery procedures through reconfiguration 
of available resources. 
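
The safe-hold philosophy described above amounts to a two-state machine in which the spacecraft makes only the first transition on its own. The Python sketch below is a simplified illustration, assuming a hypothetical two-unit spacecraft and a telemetry dictionary of health scores; the class and function names are invented for this example. It follows the ground steps in the order the text gives them: detection through analysis, commanded switching to isolate the defective element, and recovery through reconfiguration.

```python
from enum import Enum, auto


class Mode(Enum):
    NORMAL = auto()
    SAFE_HOLD = auto()


class Spacecraft:
    def __init__(self):
        self.mode = Mode.NORMAL
        self.active_unit = "A"
        self.failed_units = set()

    def on_critical_fault(self):
        # Onboard action: suspend operations and wait in a benign state.
        self.mode = Mode.SAFE_HOLD

    def command_switch(self, failed: str, spare: str):
        # Ground-commanded switching isolates the defective element.
        self.failed_units.add(failed)
        self.active_unit = spare

    def command_resume(self):
        # Ground-commanded reconfiguration restores operation.
        if self.active_unit not in self.failed_units:
            self.mode = Mode.NORMAL


def ground_recovery(sc: Spacecraft, telemetry: dict) -> list:
    """Ground steps: detect by analysis, isolate by switching, then recover."""
    steps = []
    failed = min(telemetry, key=telemetry.get)           # 1. lowest health score
    steps.append(f"diagnosed unit {failed}")
    spare = next(u for u in telemetry if u != failed)
    sc.command_switch(failed, spare)                     # 2. isolation
    steps.append(f"switched to unit {spare}")
    sc.command_resume()                                  # 3. recovery
    steps.append("resumed normal operations")
    return steps


spacecraft = Spacecraft()
spacecraft.on_critical_fault()
print(spacecraft.mode.name)
print(ground_recovery(spacecraft, {"A": 0.1, "B": 0.98}))
print(spacecraft.mode.name)
```

Under ASM, the work done here by `ground_recovery` would move onboard, which is where the expected reduction in down time comes from.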

C. Current Design Methodologies

Methodologies have been defined to include procurement/
management policies and design/development procedures. The
procurement/management policies structure the development
process through the use of formalized management reports,
design reviews, and audits. Design/development procedures
refer to the collective set of design tools and test and
validation procedures that are employed during the development
process.

Design tools are employed to evaluate the adequacy of a
proposed design prior to a commitment for fabrication. Such
tools include simulation, emulation, and reliability analysis
techniques (defined by MIL-HDBK-217). Testing procedures
strive to show that the spacecraft operates per the design
intent: they should identify faulty components and errors in
manufacturing. Validation programs, on the other hand, strive
to ensure the agreement of the system realization with the
system specification. This includes validation of performance,
reliability, and environmental requirements.
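
To give a feel for the kind of reliability analysis mentioned above, the following Python sketch uses the textbook constant-failure-rate model. It is not the handbook's prediction procedure, and the failure rate of 2e-6 per hour and the 5-year mission length are hypothetical inputs chosen for illustration. It compares a single unit with a block-redundant pair, assuming independent failures and perfect switching.

```python
import math


def reliability(failure_rate_per_hour: float, hours: float) -> float:
    """Survival probability under a constant failure rate (exponential model)."""
    return math.exp(-failure_rate_per_hour * hours)


def series(*reliabilities: float) -> float:
    """All elements must work, so the reliabilities multiply."""
    result = 1.0
    for r in reliabilities:
        result *= r
    return result


def block_redundant(r: float, copies: int) -> float:
    """At least one of `copies` identical, independent elements must work."""
    return 1.0 - (1.0 - r) ** copies


MISSION_HOURS = 5 * 8760                     # a 5-year mission
unit = reliability(2e-6, MISSION_HOURS)      # hypothetical failure rate
pair = block_redundant(unit, 2)
string = series(pair, pair, pair)            # three redundant pairs in series
print(round(unit, 3), round(pair, 3), round(string, 3))
```

The model ignores switching failures and common design faults, both of which the surrounding text treats as real concerns, so its output is an upper bound on what redundancy alone can deliver.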


II. Fault-Tolerant Computing

An assessment of the state of the art in fault-tolerant
computing was undertaken as part of the ASM study for two
important reasons. First, it is a technology that has been under
investigation for over twenty years, and that has resulted in
the development of several autonomously maintained systems
(e.g., self-repairing computer systems). Second, it appears that
onboard fault-tolerant computers will be required to act as the
automated repairman for ASM spacecraft.

A number of fault-tolerant computers have already been
constructed and used. The largest application is in telephone
switching systems. Most modern switching offices are
autonomously maintained systems. The resident computer is capable
of detecting faults within itself and in surrounding equipment,
replacing the faulty equipment, and continuing normal service
(Ref. 1). Computers with varying degrees of autonomous
self-repair have been used in other commercial applications.
Examples are the Pluribus Multiprocessor for communications
systems, and the Tandem Computer Systems often used for
financial transactions (Ref. 2). In aerospace systems, examples
of fault-tolerant computing can be found in commercial
airplanes, the Space Shuttle, and in the Saturn V guidance
computer (Ref. 3). Thus there exists a large body of design
experience in the development of fault-tolerant (i.e.,
autonomously self-maintained) computing systems for a variety of
applications.

Although two breadboard systems have been constructed 
and tested, and a third is under development, fault-tolerant 
computing has not been used on current spacecraft. Two goals 
of the Fault Tolerant Technology Working Group were to 
provide a state-of-the-art assessment of fault-tolerant comput- 
ing to the spacecraft systems technologists, and to evaluate 
problems and prospects for employing fault-tolerant comput- 
ing in spacecraft flight systems. Each member of the working 
group is actively involved in the development of a state-of-the- 
art, fault-tolerant computing system, and each made presenta- 
tions on their systems. 

Seven fault-tolerant computing systems were presented. 
They are catagorized into three groups: (1) onboard spacecraft 
computers, (2) avionics computers, and (3) commercial 
computers. 


A. Onboard Spacecraft Computers

Two computer systems were presented, the Fault-Tolerant 
Spaceborne Computer and the Building Block Fault-Tolerant 
Computer. 

1. Fault-Tolerant Spaceborne Computer. The Fault-Tolerant
Spaceborne Computer is a general-purpose computer,
designed to Air Force specifications of high throughput and a
95% probability of surviving (unattended and without
degradation of performance) for 5 to 7 years. This machine is
capable of self-reconfiguration and resumption of computations
following internal component failures, power transients, and
radiation events.

The Fault-Tolerant Spaceborne Computer is in an advanced
state of development. A laboratory breadboard has been
constructed and the fault-tolerance features verified by
experimental testing (e.g., inserting faults and verifying proper
recovery). The machine is based on complementary metal-oxide
semiconductor/silicon-on-sapphire (CMOS/SOS) LSI technology. It
is not available for flight use due to an inability to obtain
radiation-hardened (1000 gate/chip) integrated circuits
(Ref. 4).

2. Building Block Fault-Tolerant Computer. The Building
Block Fault-Tolerant Computer is a fault-tolerant distributed 
computer system architecture. It is aimed at spacecraft systems 
that employ a large number of microcomputers embedded in 
various subsystems, and is an outgrowth of the Unified Data 
System architecture developed at JPL (Ref. 5). This architec- 
ture uses a small set of standard building block circuits that 
allow existing microprocessors and memories to be connected 
together into fault-tolerant distributed computer systems. The 
building blocks connect the central processing unit and the 
random access memory to form self-checking computer 
modules that can detect their own internal faults during 
normal operation. The self-checking computer modules 
contain interfaces to a set of redundant intercommunication 
buses and can be connected into a network in which spare 
computers are employed for fault recovery. 

A breadboard of the Building Block Fault-Tolerant Com- 
puter is currently being developed, and it is expected that it 
will be completed and verified in 1982. Flight availability will 
require the subsequent development of two VLSI and four LSI 
integrated circuits, and will take an additional two or three 
years. The problem of obtaining radiation-hardened parts is
common to both the Fault-Tolerant Spaceborne Computer and the
Building Block Fault-Tolerant Computer programs.


B. Fault-Tolerant Avionics Computers

Two avionics computers were presented. These machines 
have been developed by NASA for control of future
fuel-efficient aircraft that will be dynamically unstable. Extremely
high reliability is required since lives may depend on correct
computer operation. Thus a reliability of 0.999999999 is
required for every 10-hour flight mission. These avionics
computers are not directly applicable to spacecraft. Weight,
power, and volume greatly exceed what can be supported by a
spacecraft. They are also designed to allow human maintenance
after every 10-hour flight when the plane is on the
ground, a condition not experienced by spacecraft.
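
To put the reliability requirement in perspective, the following Python sketch converts a mission reliability into an equivalent constant failure rate. It assumes an exponential failure law, R(t) = exp(-lambda t), which is an assumption made here for illustration and is not stated in the study; the function name is hypothetical.

```python
import math


def equivalent_failure_rate(reliability: float, mission_hours: float) -> float:
    """Constant failure rate (per hour) that yields `reliability` over
    `mission_hours`, assuming R(t) = exp(-rate * t)."""
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    if mission_hours <= 0.0:
        raise ValueError("mission_hours must be positive")
    # log1p keeps precision when the reliability is extremely close to 1.
    return -math.log1p(reliability - 1.0) / mission_hours


rate = equivalent_failure_rate(0.999999999, 10.0)
print(f"{rate:.3e} failures per hour")
```

Under that assumption the avionics requirement corresponds to roughly one failure in 10^10 operating hours, which indicates why so much redundancy is built into these machines.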

The two avionics computers are designated Fault-Tolerant
Multiprocessor and Software Implemented Fault Tolerance.
Both have been developed as breadboard systems and are
currently under test. Though not immediately applicable to
spacecraft, many of the techniques and insights developed in
their design will be applicable to long-term research into future
ASM systems. These machines are summarized below.

1. Fault-Tolerant Multiprocessor. The Fault-Tolerant
Multiprocessor is intended for use as one of at least two 
central computers in a redundant distributed digital system 
designed to serve as a highly survivable avionics system. The 
design is based on independent processor-cache memory 
modules and common memory modules that communicate via 
redundant serial buses. All information processing and trans- 
mission is conducted in triplicate so that local voters in each 
module can correct errors. Modules can be retired and/or 
reassigned in any configuration. Reconfiguration is carried out 
routinely from second to second to search for latent faults in 
the voting and reconfiguration elements. Job assignments 
are all made on a floating basis, so that any processor triad is 
eligible to execute any job step. The core software in the 
Fault-Tolerant Multiprocessor will handle all fault detection, 
diagnosis, and recovery in such a way that applications pro- 
grams do not need to be involved (Ref. 6). 

2. Software Implemented Fault Tolerance. Software Imple- 
mented Fault Tolerance is an ultrareliable computer for 
critical aircraft control applications that achieves fault toler- 
ance by the replication of tasks among processing units. The 
main processing units are off-the-shelf minicomputers. Fault 
isolation is achieved by using an individually-buffered, serial 
data link between each processor pair for all processors. Error 
detection and analysis and system reconfiguration are per- 
formed by software. Iterative tasks are redundantly executed, 
and the results of each iteration are voted upon before being 
used. Thus, any single failure in a processing unit or bus 
can be tolerated with triplication or quintuplication of tasks, 
and subsequent failures can be tolerated after reconfiguration. 
The Software Implemented Fault Tolerance software is highly 
structured and is formally specified using the SRI-developed
SPECIAL language (Ref. 7). 
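
The voting step described above can be sketched in a few lines of Python. This is only an illustration of majority voting over replicated task results, not the Software Implemented Fault Tolerance code itself; the function names and sample values are hypothetical.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence


def majority_vote(results: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value reported by a strict majority of replicas,
    or None when no strict majority exists."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count > len(results) // 2 else None


def disagreeing_replicas(results: Sequence[Hashable], voted: Hashable) -> List[int]:
    """Indices of replicas whose result differs from the voted value;
    these are candidates for removal during reconfiguration."""
    return [i for i, r in enumerate(results) if r != voted]


triplicated = [42, 42, 17]           # one faulty replica out of three
voted = majority_vote(triplicated)
print(voted, disagreeing_replicas(triplicated, voted))

quintuplicated = [7, 7, 9, 7, 3]     # two faulty replicas out of five
print(majority_vote(quintuplicated))
```

With three replicas a single faulty result is outvoted; with five, two faulty results are outvoted. When no strict majority exists the sketch returns None rather than guessing.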

C. Fault-Tolerant Commercial Computers

Three fault-tolerant computing projects at Carnegie Mellon 
University were presented. These systems use DEC minicom- 
puters and are aimed at commercial applications. Though not 
directly applicable to spacecraft systems, some of the insights
gained in their design are applicable to ASM research. These
machines, designated C.mmp, Cm*, and C.vmp, are summarized
below.

1. C.mmp, a multiminiprocessor. C.mmp is a canonical
multiprocessor system with a 16 X 16 crosspoint switch. Up to
16 DEC PDP-11/40 processors may be connected to the
processor ports on the switch. The 16 memory ports provide 
an address space in shared memory of 32 Mbytes. Any pro- 
cessor can access any of the 16 memory ports for memory 
accesses. The entire set of processors may communicate via an 
interprocessor bus that allows interprocessor interrupts at one 
of four priority levels, continuously broadcasts a 60-bit 
nonrepeating clock value, and allows any processor to HALT, 
START, or CONTINUE any other processor (Ref. 8). 

2. Cm*, a modular multimicroprocessor. Cm* is a modular
multiprocessor system based on the LSI-11 processor. Each
computer module is connected via an interface to an intelligent
cluster controller. The clusters of computer modules can
be interconnected via intercluster buses. Each computer module
can share memory with any other computer module in the
network through routing tables in the cluster controller
(Ref. 8).

3. C.vmp, a voted multiprocessor. C.vmp may best be
described as a multiprocessor system capable of fault-tolerant
operation. It consists of three separate LSI-11 microcomputers,
each with its own memory and peripherals. They may run
independently as three separate computers communicating
through parallel line units, or they may be switched into what
is termed voting mode under manual or program control to
form a triplicated LSI-11. This form of triple-modular
redundancy allows the voted multiprocessor to continue operating
under the situation where any one out of three copies of any
triplicated element suffers a hard failure (Ref. 8).
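
The benefit of triple-modular redundancy can be quantified with the standard two-out-of-three reliability expression, sketched below in Python. The sketch assumes independent copy failures and a perfect voter, both simplifying assumptions introduced here rather than taken from the study; the function name is hypothetical.

```python
def tmr_reliability(copy_reliability: float) -> float:
    """Probability that at least two of three independent copies work,
    assuming a perfect voter: R_tmr = 3R^2 - 2R^3."""
    r = copy_reliability
    if not 0.0 <= r <= 1.0:
        raise ValueError("reliability must be in [0, 1]")
    return 3.0 * r**2 - 2.0 * r**3


for r in (0.99, 0.9, 0.5, 0.4):
    print(f"copy R = {r:.2f} -> voted R = {tmr_reliability(r):.4f}")
```

The expression shows that voting helps only while each copy is more reliable than 0.5; below that point the triplicated system is less reliable than a single copy.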


III. State-of-the-Art Technology Assessment

In this section an assessment will be made of the
applicability of the current spacecraft and fault-tolerant computer
technology to an ASM-enhanced spacecraft. In general, design
features will need to be added to the spacecraft to accomplish
the new functions dictated by ASM. The procurement/management
policies are considered adequate for ASM, but
new design tools, reliability techniques, and test/validation
procedures will be required.

A. Spacecraft Technology Assessment

As indicated in the introduction, ASM consists of spacecraft
designs that tolerate failures and that require no ground
contact interaction for extended periods of time. The following
assessment of current technology is given against those
attributes.

1. Design features. In each of the presentations there were
several methods of fault detection, isolation, and recovery that
were common to all the spacecraft. The methods utilized in
the design and implementation have evolved in parallel with
the spacecraft requirements. As requirements for long life and
high reliability have become more stringent, specialized
functions have evolved, with satisfactory in-flight experience
serving as the basis for broad acceptance. Typical of the
specialized techniques employed by spacecraft to protect
against specific fault classes are: cross-strapping, voters,
watchdog timers, parity checks, data coding, counters, and switching
networks. These fault-handling techniques pertain generally to
subsystems. At the system level, about all that is currently
done in a fault situation is to put the spacecraft in a safe-hold
mode to await ground-operator command. It is the inclusion
of onboard detection, isolation, and recovery mechanisms for
the purpose of reducing all ground interaction that is the
distinguishing characteristic of ASM.

a. Detection mechanisms. Present spacecraft design techniques
rely on parametric data telemetered to ground
operators for fault detection. Generally, faults are inferred
from the non-real-time analysis of such data. However, ASM
requires the timely detection of faults by either direct
parametric measurement or incipient fault prediction using direct
measurement and onboard trend analysis techniques. Because
of mass and power constraints, measurement technology
becomes a leading technology driver. More extensive use of
watchdog timers, parity checks, error-coding schemes, and
counters is anticipated. Concerns about the integrity of the
detection mechanisms themselves must also be resolved, by
means of special test routines and/or
additional detection mechanisms.
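
As an illustration of incipient fault prediction by onboard trend analysis, the Python sketch below fits a least-squares line to recent samples of one measured parameter and estimates how long it will take to reach a limit. The parameter, the limit, and the function name are hypothetical; a flight implementation would also have to deal with noise, sensor faults, and nonlinear behavior.

```python
from typing import Optional, Sequence


def hours_until_limit(times_h: Sequence[float],
                      values: Sequence[float],
                      limit: float) -> Optional[float]:
    """Fit a least-squares line to (time, value) samples and return the hours,
    measured from the last sample, until the fitted line reaches `limit`.
    Returns None when the fitted line does not reach the limit after the
    last sample."""
    n = len(times_h)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired samples")
    mean_t = sum(times_h) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    if sxx == 0.0:
        raise ValueError("sample times must not all be equal")
    slope = sum((t - mean_t) * (v - mean_v)
                for t, v in zip(times_h, values)) / sxx
    if slope == 0.0:
        return None
    intercept = mean_v - slope * mean_t
    fitted_now = intercept + slope * times_h[-1]
    remaining = (limit - fitted_now) / slope
    return remaining if remaining > 0.0 else None


# Hypothetical temperature samples (deg C) taken one hour apart.
print(hours_until_limit([0, 1, 2, 3], [20.0, 22.0, 24.0, 26.0], limit=40.0))
```

A prediction of this kind would let the spacecraft act before the limit is reached instead of reacting after the fault has occurred.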

b. Isolation mechanisms. Extremely high reliability switching
techniques dominate fault isolation strategies, which
involve the ability to remove faulty components from the
system prior to reestablishing a “fault-free,” fully operational
configuration. At issue is the reliability of the switching
mechanisms themselves. Redundant switching strategies,
power control, and special test routines to assess switch
integrity during latent intervals are required.

c. Recovery mechanisms. Recovery mechanisms tend to
center upon issues of resource management and techniques to
maximize system performance subsequent to faults. As such,
they represent a system attribute, whereas detection and
isolation mechanisms are characterized as subsystem attributes.
A system-level view of the available spacecraft resources
will be needed, and so system trade-off studies striving to
minimize cost (mass, volume, power) and maximize recovery
potential from faults are required.

2. Spacecraft methodologies. In general, the procurement/
management policies have been considered adequate by the
study participants; however, new design procedures are
anticipated. The following discussion focuses on the need for new
design tools and new test and validation procedures.

a. Design tools. Simulators and emulators will be required
to provide relative assessments of the design and to assist in
trade-off evaluations. Present reliability methods, however, as
defined by MIL-HDBK-217 may not be directly applicable to
ASM. Areas requiring further study include:

(1) Software and firmware reliability. The problem
increases with the complexity of the software, and total
project orientation is needed for success.

(2) Predictive methodology for transient failure analysis.
Test data on commercial computer systems presented
during this study indicate that as many as 50% to 90%
of reported failures result from transient faults.

b. Testing. The present testing techniques have been shown
to be adequate for current spacecraft. For an ASM spacecraft,
however, new testing methods may be required because
(1) testing of ASM functions must be done at each step in the
integration process; (2) new onboard capabilities may require
new test equipment; and (3) any failure that ASM masks
prior to launch must still be detected.

c. Validation. A major element of validation, specifically
relevant to ASM, is reliability validation. As noted at a NASA
conference on validation methods research for fault-tolerant
avionics and control systems (Ref. 9):

“A traditional approach to reliability validation is
the life testing method in which one takes n
statistically identical copies of the system under
test and terminates the test after r (1 < r < n)
systems have failed. Using the accumulated time
on test, one can derive a point estimate and
confidence intervals for the mean life of the system.
These statistical techniques also allow one to
calculate confidence intervals for system reliability
for any given mission time.”

“... the number of systems required to be put
under test increases monotonically with the
reliability of the system being tested. Furthermore,
the validation problem is compounded because the
cost of an individual copy of the system also
increases remarkably with its reliability.”

“... applying traditional lifetesting techniques
implies unreasonably high validation costs.”

The conclusion of the NASA workshop is that a new validation
methodology is required for fault-tolerant avionics and
control systems. This conclusion is also appropriate for
long-lived, highly reliable spacecraft systems.
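
The life testing method quoted above, and the reason its cost grows with reliability, can be illustrated with a short Python sketch. Both functions rest on textbook assumptions introduced here rather than taken from the workshop report: exponentially distributed lifetimes for the point estimate, and a zero-failure (success-run) demonstration for the sample-size calculation. The function names and numbers are hypothetical.

```python
import math
from typing import Sequence


def mean_life_estimate(failure_times: Sequence[float], n_on_test: int) -> float:
    """Point estimate of mean life when n units are tested and the test stops
    at the r-th failure, assuming exponentially distributed lifetimes."""
    r = len(failure_times)
    if r == 0 or r > n_on_test:
        raise ValueError("need between 1 and n_on_test observed failures")
    ordered = sorted(failure_times)
    # Surviving units accumulate time until the test stops at the last failure.
    total_time_on_test = sum(ordered) + (n_on_test - r) * ordered[-1]
    return total_time_on_test / r


def units_for_zero_failure_demo(reliability: float, confidence: float) -> int:
    """Number of units that must all survive one mission to demonstrate
    `reliability` at `confidence` (from reliability**n <= 1 - confidence)."""
    if not (0.0 < reliability < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("reliability and confidence must be in (0, 1)")
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))


print(mean_life_estimate([100.0, 200.0, 300.0], n_on_test=5))
for target in (0.9, 0.99, 0.999):
    print(target, units_for_zero_failure_demo(target, confidence=0.9))
```

Each additional "nine" of demonstrated reliability multiplies the required number of test units by about ten, which is the cost problem the workshop identified.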

B. Fault-Tolerant Computing Technology Assessment

The following conclusions summarize the state of the art 
in fault-tolerant onboard computers, and the applicability of
extending the methodology of fault-tolerant computing to 
ASM. 

1. Availability. No fault-tolerant computers are
currently available for use on ASM spacecraft. The Air Force's
Fault-Tolerant Spaceborne Computer is the most viable
candidate for use in a 1985 ASM demonstration, since it is the only
fault-tolerant onboard processor in an advanced state of
development. A breadboard has been constructed and verified. The
major obstacle to its use is the development of low-power,
radiation-hardened LSI. This is an enabling technology for all
advanced digital systems in USAF spacecraft and is being
treated as a problem of high priority and urgency.

The Fault-Tolerant Spaceborne Computer may be ham- 
pered, however, because it is implemented as a single uni- 
processor. It is expected that future spacecraft architectures 
will tend toward a proliferation of small microcomputers in 
a variety of control and payload subsystems. Fault tolerance 
will need to be distributed throughout these distributed 
architectures (e.g., special fault-detection hardware will be
required in each small subsystem computer), and a hierarchy
of recovery mechanisms will be employed. Therefore it is
important that fault-tolerant distributed computing systems
be developed for future generations of ASM spacecraft. The
Building Block Fault-Tolerant Computer is a distributed 
computing system being developed toward that objective. It 
is not as far advanced in development as is the Fault-Tolerant 
Spaceborne Computer, but it may be available as an alterna- 
tive flight system in the future. 

2. Fault-tolerance methodology. Many of the techniques 
employed in fault-tolerant computer design can be extended 
beyond the computing subsystems to the ASM spacecraft 
system. This was demonstrated to a considerable
extent ten years ago in an ASM study of the NASA
Thermoelectric Outer Planets Spacecraft (Ref. 10). Some of the
fault-tolerant computing methodologies that can be applied to
spacecraft are listed below:

(1) Careful definition of fault set: In both digital and
spacecraft systems, it is necessary to carefully define
and analyze the fault conditions.

(2) Fault-detection algorithms: Following a careful
analysis of faults, it is necessary to determine the
mechanisms by which they are detected. In both digital and
nondigital subsystems, this takes the form of special
sensing hardware and software.

(3) Fault containment: To simplify fault recovery, in both 
computers and spacecraft, it is necessary to design the 
system so that the spread of damage caused by a fault 
is minimized. Whenever possible, it is advantageous 
to detect and contain faults at the lowest possible 
level. 

(4) Hierarchic fault recovery: Fault detection and recovery
in computers is done in a hierarchic fashion. Recovery
may be implemented at various system levels
depending upon the origin and severity of the fault. This
methodology clearly applies to spacecraft systems as well.

(5) Reliability modeling: Reliability and performance
models developed for fault-tolerant computers are
applicable to ASM spacecraft. The concept of
“coverage,” which describes the effectiveness of the fault
recovery mechanisms, is very important in both
computer and spacecraft systems. Extensions of existing
reliability and performance models for computers are
recommended for spacecraft evaluation.

(6) Validation: Current work on the validation of
fault-tolerant computers will be applicable to spacecraft
systems. Fault-tolerant computers and ASM design
should make it much easier to verify the integrity of
the fault recovery mechanisms without inserting faults
into the system. Techniques for and results of
experimental testing of fault-tolerant computers will be of
considerable value to ASM spacecraft engineers.


(7) Resource management: In complex computing systems
and in spacecraft there is a resource management
problem associated with fault recovery. As an attrition of
resources occurs due to faults, the system must
optimally allocate those resources remaining.
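
The role of coverage noted in item (5) can be made concrete with a simple model, sketched below in Python. It considers one active unit backed by one powered spare, with independent failures; the system survives if the active unit survives, or if it fails, the failure is covered (detected, isolated, and switched out), and the spare survives. This model and the function name are illustrative assumptions, not results from the study.

```python
def duplex_reliability(unit_reliability: float, coverage: float) -> float:
    """Mission reliability of an active unit plus one powered spare:
    R_sys = R + c * (1 - R) * R, where c is the coverage."""
    r, c = unit_reliability, coverage
    if not (0.0 <= r <= 1.0 and 0.0 <= c <= 1.0):
        raise ValueError("reliability and coverage must be in [0, 1]")
    return r + c * (1.0 - r) * r


for c in (1.0, 0.98, 0.9, 0.0):
    print(f"coverage = {c:.2f} -> system R = {duplex_reliability(0.9, c):.4f}")
```

Even with a highly reliable spare, the achievable system reliability is bounded by how often the recovery mechanism actually works, which is why coverage must be modeled explicitly.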


IV. Summary Observations

Reduction of space system vulnerability can be achieved
by moving the control/operations center functions on board
the spacecraft. To do this, an autonomous spacecraft
maintenance capability is required that (1) incorporates design
features that permit the spacecraft to tolerate faults and
(2) eliminates the need for routine ground contact. The
military spacecraft are currently designed for ground-controlled
maintenance, and in terms of the ASM capabilities described
above, they cannot now autonomously maintain their own
health and welfare. The planetary spacecraft described are
a step closer to the goal, but are not there themselves. Thus,
although some pioneering work in ASM has been done, it is
still in its infancy. In addition to the enhancement of the
current capabilities that have already been mentioned, the study
group foresees two major technological developments that are
needed to enable ASM. These are (1) a fault-tolerant data
processing system and (2) an autonomous navigation
capability (to reduce the dependence on the control/operations
center). The study group is unanimous in its assessment that,
with these developments, the ASM capability can be made
available.


The ASM-Enhanced System


I. Candidate Design Requirements

Concurrent with the state-of-the-art spacecraft system and 
fault-tolerant computer assessments, the study group deter- 
mined a set of ASM system requirements to convey ASM 
attributes, so that a spacecraft concept could be established. 
The impact of these requirements on future Air Force
spacecraft systems was then analyzed. Following are the candidate
conceptual design requirements developed from this study.

(1) All Air Force spacecraft launched after March 1989
shall meet the ASM requirements listed below. On
this date, the Department of Defense would require
all subsequent spacecraft purchased to include the
fully operational ASM capability.

(Prior to this date, it is desirable to add incremental
ASM capabilities, consistent with system performance,
as they are developed.)

(2) The ASM spacecraft shall operate without a ground
support control link for up to 60 days without
degradation of performance. This is the essence of
autonomous operations. The spacecraft will function until
ground support is available or desirable from the
viewpoint of the ground support team.


(3) The ASM spacecraft shall operate with not more than
10% degradation of key functions over a 6-month
period of autonomy. This requirement will set some
sizing constraints, such as data storage, and require
some definition of loss of performance. It stresses
the need for continuous function of the spacecraft
on an “ad hoc” basis if scheduled ground support is
not provided. The 10% figure is somewhat arbitrary;
however, at the end of 6 months, the performance of
the entire system shall be at a useful level.

(4) The ASM spacecraft shall interact with the ground
support segment for not more than 90 minutes to
perform all required support functions without
performance degradation. After a period of autonomy,
it is required that the spacecraft and ground support
perform all the required support functions in this
window. The functions include (a) downlink of all
stored maintenance history, (b) uplink of all data
load (such as star tables and ephemeris), (c) redundancy
management, and (d) testing. Specification of
the duration of the support window is mission
dependent. The intent would be an uplink support
period approximately the same as that required for
non-ASM spacecraft.

(5) ASM shall not change the design lifetime of the 
spacecraft. The imposition of the requirement for 
ASM on a spacecraft development is in addition to 
mission-imposed requirements, particularly the design 
lifetime. ASM will impact the design methodologies. 
Such design issues as depth of redundancy must take 
into account the rate at which resources are used up 
with the ASM design so that the total lifetime or
mean mission duration shall not be reduced. 

(6) ASM shall not change the performance of the space- 
craft or its payload. All requirements placed upon 
the spacecraft development for performance of 
either spacecraft or payloads shall not be affected 
by the presence of autonomous spacecraft mainte- 
nance. The spacecraft must be designed to provide 
these performance levels in the absence of frequent 
ground control interaction. Specific additional 
spacecraft functions, such as navigation, may be 
required to meet the autonomy requirement. If so,
the performance of these functions (e.g., navigation
accuracy) must support non-ASM system performance
requirements.

(7) The ASM spacecraft shall be able to recover from
failures that have been defined a priori, and the
probability that any particular failure was defined a
priori shall be ≥ 98%. The ASM functions include
monitoring the spacecraft performance for faults and
problem symptoms, and, in the presence of a fault,
identifying, isolating, and implementing the recovery
mode at both subsystem and system levels. The a
priori analysis shall be sufficiently complete that,
during the lifetime of the spacecraft, at least 98% of
the failures (e.g., where some component has failed)
will be identified in this manner (the coverage is
98%). Compound failures wherein multiple symptoms
occur simultaneously or near simultaneously
during the detection and recovery period can be
exempted from this requirement.

(8) Following launch, the ASM spacecraft shall go
through a period of in-orbit checkout and
initialization of the same duration as that of a comparable
non-ASM spacecraft. The autonomy requirements
discussed here are applied to the operational period
of the spacecraft, which is deemed to begin following
the in-orbit checkout period. In the checkout period,
maintenance will be under ground control, with
autonomous capabilities turned on or off as
appropriate. Since the addition of ASM does add certain
functions, operating modes, and complexities to the
spacecraft, these must also be checked out during the
same period. Following checkout, all autonomy
requirements will apply.

(9) The spacecraft shall process and store all onboard
management data required for ground support, and
shall telemeter the data during the ground support
periods upon ground command. The capability shall
handle all necessary data for 6 months. No matter
how confident designers may be of the maintenance
capability of the spacecraft, it will be necessary to
leave a record for ground support (an audit trail).
Without this information, the ground support
function cannot evaluate the state of the spacecraft and
use the record of performance to extend the lifetime
of the spacecraft, develop or implement alternative
operating modes, or improve future designs.

(10) The ASM spacecraft shall transmit a message to the
ground at the first opportunity following any
onboard fault-management activity. Whenever an
incident occurs that requires maintenance activity in
response to failure symptoms, it is important that the
ground be given the opportunity to review the action
and to verify the status and mode of the spacecraft.
Thus, a telemetry message indicating that some
activity had taken place would be sent to the ground
at the first pass over an appropriate ground station.
This type of signal may be coded into the user data
to trigger an alarm at the ground support station.
Sending of the message does not abolish the
obligation of the spacecraft to retain the data for the
maximum period, and to continue to operate in an
autonomous manner for the established periods.

(11) The ground support shall be able to override ASM
management activities for the system and the
subsystems. While the ASM spacecraft shall have the
ability to perform redundancy management in the
presence of an apparent fault or problem, it is
necessary that the ultimate control over these functions be
maintained at the ground, and that the spacecraft
shall allow for ground communication that overrides
and can reverse the prior decisions of the ASM
functions. The capability is necessary so that the system
will be able to recover from such learning curve
uncertainties as misdiagnosed problems or design
flaws. In this way, nonfailed components may be
recycled back into the configuration inventory, or
the spacecraft alternate modes of functioning may be
utilized to make use of partial capabilities of
components. In terms of a hierarchical decision tree, the
ground support personnel shall occupy the top level
to maximize system performance.

(12) The source of last resort for fault isolation and
recovery shall be the ground support. The ASM spacecraft
shall be designed to recognize when it has been 
unable to isolate, remove, and recover performance 
following a fault. When this occurs, the spacecraft 
shall take action to protect itself from self-injury or 
dissipation of resources (such as an engine firing limit 
cycle that would consume propellant), and await 
ground intervention. 
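
Requirements (11) and (12) together imply an escalation chain: recovery is attempted onboard first, and when it fails the spacecraft protects itself and waits for the ground. The Python sketch below illustrates that control flow only. The level names, fault names, and handler functions are hypothetical and are not taken from the study.

```python
from typing import Callable, List, Tuple

# A handler returns True when its level isolated the fault and restored service.
Handler = Callable[[str], bool]

SAFE_HOLD = "safe-hold, awaiting ground"


def recover(fault: str, levels: List[Tuple[str, Handler]]) -> str:
    """Try each onboard recovery level in order, lowest first, and report which
    one handled the fault. If none succeeds, fall back to a protective
    safe-hold so that the ground can act as the source of last resort."""
    for name, handler in levels:
        if handler(fault):
            return name
    return SAFE_HOLD


# Hypothetical fault names and handlers, for illustration only.
levels = [
    ("subsystem", lambda fault: fault == "gyro dropout"),
    ("system", lambda fault: fault == "bus interface fault"),
]

print(recover("gyro dropout", levels))
print(recover("bus interface fault", levels))
print(recover("unmodeled fault", levels))
```

In this picture the ground override of requirement (11) sits above every onboard level: ground personnel can review the logged outcome and reverse it.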


Satisfying these design requirements implies the movement 
of the control of routine maintenance operations from the 
ground to the spacecraft. The ground segment will assume a 
supervisory role, always maintaining the ability to override 
ASM actions, but allowing the spacecraft to initially handle 
its own maintenance functions. The space segment, on the 
other hand, will become more complex due to the added 
operations it must perform, including onboard navigation 
(eliminating the need for routine uplink) and fault detection, 
isolation, and recovery. To handle these added operations, a 
fault-tolerant data processing subsystem and an autonomous 
navigation subsystem will be required. The major benefits of 
ASM would then include: (1) reduced system vulnerability,
because it is no longer dependent upon the ground station
or possible incorrect commands by human operators, and
(2) faster recovery from failures (seconds instead of hours or
possibly days) because recovery procedures would start
immediately upon fault detection.


II. Impact on the Ground and 
Space Segments 

Some examples of operations and maintenance functions 
that presently are accomplished by the ground segment, but 
with ASM will be accomplished by the spacecraft, include: 

(1) Attitude/pointing commands

(2) Thermal control loop

(3) Power management

(4) Fault monitor/isolation

(5) Fault-tolerant computation

(6) Fault switching

(7) Load switching

(8) Trend analysis

A reduction in ground control activity can clearly be seen. It
should be remembered, however, that in its supervisory
capacity, the ground segment will have the ultimate authority
and responsibility in all situations. As total reduction in
ground control will not occur at one time, a transition phase
will be required. This phase will enable: (1) in-flight
measurements of effectiveness for ASM over a diverse set of operating
conditions, (2) the development of understandable and
predictable ASM operations, and (3) simultaneous support of
both ASM and non-ASM operational spacecraft.

The increase in spacecraft autonomy will mean an increase 
in the complexity of the spacecraft. While this increase in com- 
plexity must not introduce catastrophic failures or reduce the 
payload performance, it will tend to increase the spacecraft's 
mass, power consumption, and total cost. Given the study 
group’s knowledge of current and projected technology, the 
following heuristic estimates were established as reasonable 
design goals for an ASM-enhanced spacecraft: 

Power consumption: ASM < 10% of total

Mass impact : ASM < of total 

Cost impact: ASM < 10% of life-cycle cost


III. A Hierarchical Description of the 
Space Segment 

The following sections describe what the study participants 
believe will be the impact of ASM on a generalized spacecraft 
system. In these descriptions, the following assumptions have 
been made: 

(1) The ASM requirement is added to a new spacecraft 
before design. 

(2) ASM technologies will be available. 

(3) Payload is treated as a subsystem, except for user data
flow.

(4) As long as mission objectives are met, normal
spacecraft functions may be interrupted during certain fault
recovery procedures.

A. System Architecture

System architecture evolves from the mission requirements,
and includes the hardware organization, data flow
characteristics, and (if a digital system) the hierarchical operating system.
The system must judiciously allocate the available resources
and, upon command, must report all ASM actions (including
parametric data) to the control/operations center. Finally, it
must also ensure its own integrity (through self-diagnosis) so
that incorrect actions and ground system lock-out modes are
eliminated.

An example of a system architecture that could be used for
ASM is shown in Fig. 3. This is characterized by both
distributed and central processing system attributes. Efficient
management of the spacecraft resources based upon prespecified
algorithms requires centralization of high-level decision making.
This would be accomplished by a fault-tolerant processor,
serving as the spacecraft central controller, augmented by
processors located in each of the subsystems as appropriate.
In addition to the new subsystems already mentioned, the
architecture should also accommodate additional
mission-unique subsystems.

The system architecture example described above is one of 
several possible architectures for an ASM spacecraft. While a 
detailed investigation of the various architectures was not a 
part of this study, the participants believe that such an effort 
should be undertaken as one of the first tasks of an ASM 
development program. 

Whichever architecture is chosen, the study group believes 
that a “layered” fault protection scheme should be used, 
enabling fault containment at the lowest possible level to 
minimize subsystem interdependencies resulting from fault 
propagation (including data contamination). This fault 
protection scheme is illustrated in Fig. 4. In this scheme, 
individual subsystems, under system control, will diagnose local 
failures and take corrective action. Ambiguous problems 
resulting from failures within the interfaces between subsystems 
will require diagnostic routines and hardware to pinpoint the 
failure. Some unresolved system issues include the problems of 
transients, false failure alarms, multiple faults, and faults 
within the fault-tolerant computing system. Once the system 
has been designed, test and validation procedures must be 
formulated. Finally, there should be a demonstration program 
showing that the requirements for ASM are met without 
compromising either the mission lifetime or payload performance. 

Fig. 3. Example of a system architecture for ASM spacecraft 

Fig. 4. “Layered” architecture of fault processing 
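
The layered scheme can be illustrated with a minimal sketch, 
given here in Python. It is not the study group's design: the 
class names, fault codes, and recovery actions are all 
hypothetical. It shows only the principle that a subsystem first 
tries to contain a fault locally, that anything it cannot resolve 
is escalated to the system level, and that every action is 
recorded so it can be reported on command. 

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Fault:
    source: str  # subsystem that observed the symptom
    code: str    # hypothetical fault identifier


@dataclass
class Subsystem:
    name: str
    # Hypothetical table of local recovery actions, keyed by fault code.
    local_handlers: Dict[str, Callable[[], str]] = field(default_factory=dict)

    def handle(self, fault: Fault) -> Optional[str]:
        """Try to contain the fault locally; return the action taken, or None."""
        handler = self.local_handlers.get(fault.code)
        return handler() if handler else None


@dataclass
class SystemController:
    subsystems: Dict[str, Subsystem]
    audit_trail: List[str] = field(default_factory=list)

    def process(self, fault: Fault) -> str:
        # Lowest layer first: the owning subsystem attempts recovery.
        action = self.subsystems[fault.source].handle(fault)
        level = "subsystem"
        if action is None:
            # Unresolved or ambiguous faults escalate to the system layer.
            action = "system-level diagnosis"
            level = "system"
        # Record every ASM action so it can be reported on command.
        self.audit_trail.append(f"{fault.source}:{fault.code}:{level}:{action}")
        return level


power = Subsystem("power", {"bus_undervoltage": lambda: "switch to redundant regulator"})
thermal = Subsystem("thermal")
controller = SystemController({"power": power, "thermal": thermal})

print(controller.process(Fault("power", "bus_undervoltage")))    # handled locally
print(controller.process(Fault("thermal", "interface_timeout")))  # escalated
```

A real system would also have to deal with the unresolved issues 
listed above (transients, false alarms, multiple faults, and 
faults in the controller itself), none of which this sketch 
attempts. 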


ASM will affect the traditional subsystems (attitude and 
articulation control, power, telemetry and data handling, 
payload, communications, propulsion, and thermal control) 
by requiring that they add the capability of diagnosing and 
handling their own faults. The conceptual design requirements 
imposed on the ASM spacecraft, however, necessitate the 
potential addition of two new subsystems. These include a 
fault-tolerant data processing subsystem and an autonomous 
navigation subsystem. 

The need to integrate independent subsystems with indi- 
vidual processing requirements into a control hierarchy for the 
purpose of managing and reporting fault-protection leads to a 
requirement for a fault-tolerant data processing subsystem. 
Because of the potential for power-interrupt failure modes, 
this subsystem must include limited nonvolatile backup mem- 
ory resources for selected critical program and data storage. 

The requirement for six months of unattended operations 
necessitates an autonomous navigation capability. The problem 
of determining vehicle position and velocity depends on the 
mission requirements for attitude control and pointing. It involves 
the characterization (modeling) of complex gravitational 
fields, including the effects of Earth figure and multibody 
(Earth, Moon, Sun) interactions that perturb the vehicle posi- 
tion and velocity. As altitude control requirements become 
more stringent, more precise models and advanced sensors 
permitting real-time drag acceleration measurements will be 
required to complement existing inertial measurement devices 
and celestial sensors. 

Finally, the requirements for a six-month audit trail and 
onboard trend analysis to permit fault prediction and projection 
necessitate storage and manipulation of a large volume of 
data. Without ground links, the study participants believe 
additional data storage capabilities, coupled through the data 
network to the other spacecraft subsystems, will be needed. 


Implementation Plan 


I. Introduction 

This section, together with the Research Agenda of the 
next section, describes the study participants' recommended 
plan of attack to solve the problem that prompted the ASM 
study: to satisfy the requirement for spacecraft readiness in 
the face of the loss of ground stations. In contrast with the 
Research Agenda, which addresses medium- and long-range 
academic research for an advanced ASM system of the 1990s, 
this section focuses on the near-term (next five years) industrial 
technology development and, most importantly, the 
earliest possible system-level, proof-of-concept demonstration. 
The plan stresses delivery of "product" in a steady stream, 
from subsystems to a complete system, for the System Program 
Offices' consideration and introduction into flight programs. 
In this sense, the plan is a technology program that is managed 
like a project, with focused goals and milestones to be met. 
While there is no provision for a flight demonstration in this 
plan, a definite goal has been to provide a program that will 
generate continuous ASM technology "fall-out," which can be 
utilized in ongoing programs and in design block changes. 

The program described below is preliminary; the limited 
resources of the study precluded a detailed program develop- 
ment and cost estimate. However, the study participants feel 
the proposed plan described contains the essentials of a 
workable program needed to meet the future requirements 
of the Air Force. 


A. Purpose 

The purpose of the plan is to recommend a coordinated set 
of developments that will give industry a demonstrated 
capability to build an ASM spacecraft, and hence, enable the Air 
Force to change from ground-dependent to autonomous 
operational spacecraft by 1989. 

B. Goals 

The goals of the plan are: 

(1) To develop an ASM technology and apply it as early as 
possible to existing programs, especially DMSP, DSP, 
GPS, and DSCS III. 

(2) To develop, by 1985, a demonstrated industrial capability 
to produce autonomous spacecraft, so that the 
first operational launch may take place by 1989. 

C. Approach 

The approach taken in preparing the plan can be summarized 
in the following points: 

(1) Involve as many relevant governmental and industrial 
organizations as possible. This will create a broad base 
of ASM experience, design, and methods. 

(2) In support of the first goal, begin work with existing 
subsystem designs; ASM implications and problems 
must be characterized, designs and breadboards must 
be modified, and results demonstrated early. 

(3) In support of the second goal, begin work on a parallel 
system-level analysis, design, and hardware program 
leading to a proof-of-concept demonstration. 

(4) Use as much available hardware as possible. Develop 
and build as little new equipment as possible to meet 
requirements. Acquire engineering test models of 
actual and/or representative systems/subsystems of Air 
Force satellites. 

(5) Focus on ASM-required changes only; design life and 
performance advancements not needed for ASM should 
not be pursued. 

(6) Hold frequent reviews and conferences for technical 
information exchange with all concerned industrial, 
academic, and government organizations. 


II. General Plan Description 

The study participants recommend that the program consist 
of four major elements, prefaced by a three-month start-up 
period: first, an activity addressing existing programs at the 
subsystem level, producing demonstration products within two 
years; second, a system-level project addressing ASM-enhanced 
Air Force programs with proof-of-concept in five years; third, 
applications research directed at filling technological gaps; and 
fourth, an advanced system development, aimed at the 1990s, 
to provide an opportunity for unconstrained research to 
expand capabilities beyond the foreseeable future. These 
elements are denoted as Tasks 1, 2, 3, and 4, respectively. The 
advanced systems development, Task 4, is identified for 
completeness, but because its products would not meet the 1989 
launch requirement, resources are not identified. The Research 
Agenda elaborates Task 4. 

A general view of the plan is shown in Fig. 1, in which the 
arrows indicate typical points of technology transfer between 
tasks and to the System Program Offices. All tasks start at the 
beginning of CY81 to allow program definition and start-up to 
take place in the first three months of FY81. Task 1 is a 
two-year activity that assesses increased fault detection, 
isolation, and recovery for existing subsystems. Design changes 
will be made and breadboard units will be modified to test 
ASM capabilities and benefits. Task 2 is a 5-year activity that 
includes a top-down system development and the necessary 
new subsystem technology developments required for ASM. A 
pre-Phase A effort is required to prepare a procurement 
specification and to select the contractor for both Task 2 and 
Task 3. In Phase A, the mission requirements and spacecraft 
design will be established, while in Phase B, the fabrication, 
integration, test, and demonstration of the ASM system will 
be performed. Task 3 is a five-year applications effort required 
to develop new well-defined subsystem technologies. Task 4, 
through CY85, is performing the basic research for a 
“second-generation” ASM system as mentioned earlier. 

III. Task Descriptions 

The layout of the entire program is shown in more detail in 
Fig. 5. In the view of the study participants, the plan repre- 
sents the best method of addressing the urgency of obtaining 
an ASM readiness, given the available resources. The relative 
times needed to accomplish the objectives are shown; reduced 
funding or delays in program start-up will result in commen- 
surate delays in completing the tasks described below. 

A. Task 1: Existing Subsystem Redesign to ASM 

The first task is a 24-month effort to characterize the sub- 
systems involved with ASM, redesign the breadboards, check- 
out subsystem ASM functions, and provide measures of 
capability required to accommodate ASM. These measures will 
be in such terms as memory size and throughput. Because the 
subsystems are well known, it is felt that modifying them to 
include ASM features will be the quickest and most cost- 
effective way to size the challenge early and to incorporate 
some ASM capability into the spacecraft. Once the modified 
subsystems are successfully demonstrated, the System Program 
Offices could consider them for operational use. 

It is expected that much of the design work, and perhaps 
the breadboards, would be important to the Task 2 effort, 
and heavy interaction between tasks should be anticipated. 
The subsystems to be studied are (ranked by their ASM 
importance): attitude and articulation control, power, telemetry 
and data handling (including tape recorders), payload, 
communications, propulsion, and thermal control. Structure 
and mechanical devices are not included because their design 
is little impacted by ASM requirements. It is recommended 
that two contractors perform on each subsystem to gain a 
diversity of experience for contractor and program application. 
As no two designs are the same, additional information 
will be gained from this approach to broaden the data base. 

The first six months is spent on design study. The subsystem's 
fault characteristics will be examined, and the fault 
detection, isolation, and recovery techniques will be developed. 
The hierarchical assignment of fault recovery, between 
faults totally handled within the subsystem and those “passed 
on” to the system for action, will be developed. Evaluation of 
the reliability of sensors and switching, which are essential to 
“error free” ASM, will be done. Changes in design techniques, 
instrumentation, and associated software or firmware as well 
as the hardware will be covered. An assessment will be made as 
to what benefits accrue in reduced ground maintenance with 
the recommended ASM capabilities in the subsystems. In the 
next nine-month period, detailed design takes place. Algorithms 
for ASM will be defined, coded, and debugged. Hardware 
modifications and software changes will be made. ASM 
design features may be implemented in single-string fashion so 
that in this exercise only the subsystem under test will be fault 
tolerant. 

Fig. 5. Autonomous spacecraft maintenance program 

In the last year of the program, the breadboards or engineering 
test models and associated support equipment will be 
modified to include the ASM features, and then tested. The 
testing will be a rigorous exercising of the fault-processing 
logic by injection of all types of faults. The testing will provide 
specific valid measures of ASM performance and design 
requirements in such terms as memory requirements, speed, 
and recovery algorithms, which can be utilized by various 
program offices as appropriate in their ongoing or new programs. 
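
As a rough illustration of what exercising fault-processing 
logic by fault injection might look like, the Python sketch below 
applies several fault models to a simulated sensor reading and 
reports the fraction of injections that a detector flags. The 
detector, its voltage limits, and the fault models are all 
hypothetical and are not taken from the study. 

```python
import random


def detector(reading: float, low: float = 26.0, high: float = 30.0) -> bool:
    """Hypothetical limit check: flag a bus-voltage reading outside [low, high]."""
    return not (low <= reading <= high)


# Hypothetical fault models: each maps a nominal reading to a faulty one.
FAULT_MODELS = {
    "stuck_at_zero": lambda v: 0.0,
    "open_circuit": lambda v: float("inf"),
    "small_drift": lambda v: v + 0.5,  # stays inside the limits
    "large_drift": lambda v: v + 5.0,
}


def run_campaign(trials: int = 100, seed: int = 0) -> dict:
    """Inject every fault model `trials` times; return detection fraction per model."""
    rng = random.Random(seed)
    detected = {name: 0 for name in FAULT_MODELS}
    for _ in range(trials):
        nominal = rng.uniform(27.0, 29.0)
        for name, inject in FAULT_MODELS.items():
            if detector(inject(nominal)):
                detected[name] += 1
    return {name: count / trials for name, count in detected.items()}


coverage = run_campaign()
for name, fraction in coverage.items():
    print(f"{name}: {fraction:.0%} detected")
```

In this sketch the small drift is never detected, because the 
faulty reading remains within the limits. A test campaign of 
this kind is what would expose such a gap and help size the 
detection logic. 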

At the conclusion of Task 1, impacts of ASM will be clearly 
established. Fault character will be understood; new sensor 
and switching technology will emerge; software and hardware 
will be sized to do the job; algorithms for handling faults will 
be checked out; many system issues will be discovered for 
resolution in Task 2; and finally, the System Program Offices 
will have an opportunity to assess ASM applicability at the 
subsystem level. 


B. Task 2: ASM System Demonstration 

This second recommended task is a five-year activity that 
ends in a system-level demonstration of ASM. It is laid out 
very much as a typical flight project might be, but truncated at 
the system test of a prototype spacecraft with no flight hardware 
built. The assumption is made that the system demonstration 
will be achieved by applying ASM to one mission, 
such as DSP, DMSP, or GPS, but the extension of this ASM 
technology to all Air Force missions will be an active design 
consideration. The reviews are typical, with only the System 
Test Requirements and the Proof-of-Concept Acceptance 
Reviews being unique to this program. The phases are typical 
as well: systems analysis and requirements generation; system 
design; subsystem design, fabrication, and test; and system 
integration and test. 

The analysis and requirements activity proceeds during the 
second year and culminates at the Preliminary Design Review 
with the production of the Mission and System Requirements 
document. The activity includes mission impacts, recovery 
strategy, degradation profiles, and data return strategy as 
faults occur; reliability and risk analyses; operation analysis 
with ASM; flight/ground tradeoffs; spacecraft system fault 
analysis; development of the “layered” fault protection system 
architecture; fault detection, isolation, and recovery at the 
system level; payload interaction with ASM; and in-flight 
navigation requirements generation. 

The ASM system design occurs during the second and third 
year, culminating at the Critical Design Review. The design 
team will study alternate design approaches; allocate functions 
between hardware, firmware, and software; study distributed 
vs central computing; analyze performance; write specifications; 
and have the usual heavy system/subsystem interaction 
on design (including Task 1 personnel). The key product will 
be the Spacecraft System Specification, available at the Critical 
Design Review. Another important part of this period is 
the system test requirements to be imposed. The design must 
allow access for fault injections during test, which may not be 
easy to implement. The System Test Requirements Review 
will address adequacy of testing to prove that ASM has a 
flight-ready capability. 

The fourth and fifth years of the ASM system activity will 
be used to redesign, fabricate, and test the subsystems, and 
then integrate them and test the system for proof of concept. 
Where possible, the subsystems from Task 1 will be used, but 
modifications will still have to be made to integrate them into 
the overall system design. Redesign, fabrication, and test 
should take 15 months. The subsystems will be delivered to 
system test at 51 months into the project. 

The test facility preparation starts at the beginning of the 
fourth year, and must be completed by subsystem delivery. 
Support equipment must be designed or modified as needed. 
It must be determined how the fault injection and testing will 
be done for proof-of-concept testing. In addition to the differ- 
ent states that the facility will have to test, it must also be able 
to simulate the power source, thrusters, spacecraft dynamics, 
and mechanical devices. 

Finally, system integration begins at 48 months, and proof- 
of-concept testing begins at 54 months. The system-level proof 
of concept will be a full electrical demonstration of the ASM 
system, under the test conditions already mentioned. Testing 
will be performed in a laboratory ambient environment. 

Throughout the program, attention will be paid to any 
spacecraft block changes to ongoing programs that may pro- 
vide an early opportunity for ASM application. Block changes 
will not necessarily affect the ASM system demonstration 
project, but if one occurs at an opportune time, some of the 
system development may be directed toward it. 

The Proof-of-Concept Acceptance Review is the final milestone 
in the system activity. Test results would be examined 
for validity and completeness, and if the review is successful, 
ASM will be demonstrated as a viable, implementable 
technology. 

C. Task 3: Applications Research 

Task 3 is a new technology research and development 
activity of five years' duration that addresses known gaps at 
the subsystem level. Two are currently identified: a distributed 
fault-tolerant data processor with a nonvolatile computer 
backup memory, and an autonomous navigation subsystem. 
Figure 5 shows the development schedule for these 
items. It is expected that the breadboards for these subsystems 
would be used in the system proof of concept. If not available, 
appropriate simulators/emulators would have to be 
provided. Resources for in-flight navigation are not included 
here because it is assumed that currently-funded programs 
elsewhere can be expected to produce the needed breadboard 
in 1983. 

D. Task 4: Advanced ASM System Development 

As mentioned earlier, this effort is comprised of the Re- 
search Agenda of the next section. The products of that 
research will fold into the “second-generation” ASM system of 
the 1990s. 

E. Program Cost Estimate 

A budgetary cost estimate for Tasks 1, 2, and 3 is shown in 
Table 7. This does not include funds for the development of 
the autonomous navigation capability, which is assumed to be 
handled in another program. These figures should be considered 
only an estimate for several reasons. First, the cost of 
developing the new technology is not well known. Second, a 
specific mission application has not been assumed, and so 
candidate spacecraft could not be assumed. Finally, substantiating 
data was not provided by the individual participants. 
For these reasons, a more definitive cost study should be 
performed in the initial phases of the activity. 
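
The figures in Table 7 can be cross-checked with a few lines of 
arithmetic. The Python sketch below sums the per-task and 
per-year resources and compares them with the stated totals. It 
assumes that the five yearly columns run from CY81 to CY85 and 
that Task 1 has no resources in the last two years; both 
assumptions are consistent with the stated totals, but the 
corresponding cells are not legible in the source table. 

```python
# Program resources in $k (FY80 dollars) by calendar year.
# Assumption: columns are CY81..CY85; Task 1 has no CY84/CY85 resources.
YEARS = ["CY81", "CY82", "CY83", "CY84", "CY85"]
RESOURCES = {
    "Task 1": [2900, 3500, 800, 0, 0],
    "Task 2": [500, 4500, 7500, 7500, 3000],
    "Task 3": [600, 800, 1300, 2300, 1200],
}

# Totals as stated in Table 7.
STATED_TASK_TOTALS = {"Task 1": 7200, "Task 2": 23000, "Task 3": 6200}
STATED_YEAR_TOTALS = [4000, 8800, 9600, 9800, 4200]
STATED_GRAND_TOTAL = 36400

task_totals = {task: sum(values) for task, values in RESOURCES.items()}
year_totals = [sum(column) for column in zip(*RESOURCES.values())]
grand_total = sum(task_totals.values())

print("task totals consistent:", task_totals == STATED_TASK_TOTALS)
print("year totals consistent:", year_totals == STATED_YEAR_TOTALS)
print("grand total consistent:", grand_total == STATED_GRAND_TOTAL)
```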


IV. Summary 

The Implementation Plan presented here, in the view of the 
study participants, represents a balanced, focused attack on 
the Air Force's spacecraft maintenance problems. System, 
subsystem, and new technology elements are all pursued at a level 
suited to the difficulty of the specific ASM challenge while 
recognizing the needs for early demonstrable results for ongoing 
operational programs. The above-described implementation 
plan is recommended by study participants as the basis 
for the Air Force's ASM program. 


Table 7. ASM program 

                                Program resources, $k(a) 
Task  Program element               CY81    CY82    CY83    CY84    CY85   Total 

1     Existing subsystems(b)       2,900   3,500     800       -       -   7,200 
      redesign to ASM 

2     ASM system design(c)           500   4,500   7,500   7,500   3,000  23,000 
      and demonstration 

3     Applications research(d)       600     800   1,300   2,300   1,200   6,200 

      Totals                       4,000   8,800   9,600   9,800   4,200  36,400 

(a) Figures are FY80 dollars. 
(b) Two contractors per subsystem assumed. 
(c) Single system contractor assumed. 
(d) Autonomous navigation development not included. 


Research Agenda 


I. Introduction 

A research program to support the development of automated 
spacecraft maintenance must focus on the most critical 
problems expected in that development. A major challenge is 
to channel a great deal of fault-tolerance expertise, developed 
for other applications, into work that will specifically benefit 
the space program. Fortunately, most of the ASM spacecraft 
problems are shared by many other applications (e.g., process 
control, avionics, and robotics) and are sufficiently general in 
scope to be of considerable interest to the academic community. 
This proposed Research Agenda is organized around 
specific spacecraft development problems. An underlying 
cause of most of these problems is increasing complexity. 
The basic motive force is the rapidly expanding capabilities of 
LSI and VLSI technology. Until very recently, satellites contained 
a few hundred to a few thousand integrated circuits. Each 
integrated circuit contained a few gates or registers, and the 
collection of integrated circuits was combined to form a 
system. We will soon fly single VLSI chips that contain thousands 
of gates and memory cells such that each chip is itself a 
complex subsystem in a tiny package. This can result in an 
enormous increase in functional capabilities in satellites. 
Onboard navigation, very-high-performance signal processing, 
threat evasion, pattern recognition, and a host of other 
capabilities will become feasible. 


It is expected that fault tolerance will be an important 
attribute of VLSI design because of the problems of transient 
faults and testability. The very high complexity of large VLSI 
systems is expected to result in transient faults every few 
minutes, or every few hours. Current experience indicates a 
transient error rate in LSI memory of one error per hour per 
million bits. Even if this rate is reduced an order of magnitude, 
impaired operation will occur unless the spacecraft system is 
designed to detect and recover automatically from these fault 
conditions. The problem of thoroughly testing complex VLSI 
circuits has only recently been recognized. Many existing 
devices are essentially untestable and design faults are uncovered 
in the field after prolonged usage. Since the only access to 
a finished device is through a limited number of pins, it 
becomes nearly impossible to exercise all internal states of a 
device containing thousands of transistors. Testable design 
methodologies have been recognized as a high-priority research 
problem in industry, and are even more critical for space 
applications. 
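
To make the quoted transient rate concrete, the following Python 
sketch converts the figure of one error per hour per million bits 
into a mean time between transient errors for a given memory 
size. The 64-kbit memory size is a hypothetical example, and the 
calculation assumes that errors occur independently at a constant 
rate proportional to the number of bits. 

```python
BITS_PER_MEGABIT = 1_000_000


def transient_errors_per_hour(memory_bits: int,
                              errors_per_hour_per_megabit: float = 1.0) -> float:
    """Expected transient errors per hour for a memory of the given size."""
    return errors_per_hour_per_megabit * memory_bits / BITS_PER_MEGABIT


def mean_hours_between_transients(memory_bits: int,
                                  errors_per_hour_per_megabit: float = 1.0) -> float:
    """Mean time between transient errors, in hours (constant-rate assumption)."""
    return 1.0 / transient_errors_per_hour(memory_bits, errors_per_hour_per_megabit)


memory_bits = 64 * 1024  # hypothetical 64-kbit onboard memory

# Quoted rate, then the same rate reduced by an order of magnitude.
for rate in (1.0, 0.1):
    hours = mean_hours_between_transients(memory_bits, rate)
    print(f"rate {rate} per hour per Mbit: one transient every {hours:.1f} hours")
```

Even at the reduced rate, a memory of this size would see a 
transient roughly every six days on average, which is short 
compared with a six-month period of unattended operation. 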

New and largely unexplored problems are expected at the 
system architecture level of space systems. Proliferation of 
specialized microelectronic controllers in spacecraft subsystems 
will lead to more complex cooperation between subsystems, 
between the spacecraft and ground, and perhaps 
between different spacecraft. Consequently, the system 
organization, software, and fault-tolerant aspects of spacecraft 
will have to become correspondingly more complex. A 
hierarchy of computing processes is envisioned. This implies 
more onboard monitoring circuitry to detect faults before 
errors propagate through the system and make diagnosis and 
recovery extremely difficult. Automatic trend analysis may be 
employed to record and discover new error patterns, and 
heuristic recovery algorithms may be required to recover from 
unanticipated faults. 
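
One simple form that automatic trend analysis could take is 
sketched below in Python: a least-squares line is fitted to 
recent telemetry samples and extrapolated to estimate when a 
monitored parameter would cross an alarm limit. The parameter, 
sample values, and limit are hypothetical, and a linear trend is 
an assumption made for illustration only. 

```python
from typing import Optional, Sequence, Tuple


def fit_line(times: Sequence[float], values: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


def predict_crossing(times: Sequence[float], values: Sequence[float],
                     limit: float) -> Optional[float]:
    """Projected time at which the fitted trend reaches `limit`.

    Returns None if the trend is flat or the crossing is not in the future.
    """
    slope, intercept = fit_line(times, values)
    if slope == 0:
        return None
    t_cross = (limit - intercept) / slope
    return t_cross if t_cross > times[-1] else None


# Hypothetical battery temperature samples (deg C), one per day.
days = [0.0, 1.0, 2.0, 3.0, 4.0]
temps = [20.0, 20.5, 21.0, 21.5, 22.0]

print(predict_crossing(days, temps, limit=30.0))  # projected day of limit crossing
print(predict_crossing(days, temps, limit=10.0))  # trend is moving away: None
```

The audit trail described earlier would supply the stored 
samples; discovering genuinely new error patterns would need 
more than a single fitted line. 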

In summary, there exist a variety of research areas that are 
rich in both substance and applicability, and are essential to 
the development of future space systems. 


II. Research Plan 

Recognizing that resources are limited, the following 
research plan is broken into five areas that are essential to future 
ASM development: (1) VLSI technology, (2) architecture of 
advanced ASM systems, (3) software fault tolerance, (4) mod- 
eling and analysis, and (5) supporting development. In terms 
of criticality, they are listed in descending order. Subsystem 
technology and especially related VLSI issues must be resolved 
no matter what architecture is chosen. An understanding of 
system architecture is necessary to make modeling and analysis 
more useful and relevant. 

A. VLSI Technology 

Testing of LSI devices is a serious and expensive problem 
in current spacecraft. It will be a critical issue in ASM systems 
because it is necessary to detect faults very quickly after their 
occurrence, so that autonomous recovery mechanisms can 
restore the spacecraft to normal operation with minimal 
disruption of performance. This research area has two 
components: (1) self-testing VLSI, and (2) on-chip redundancy. 

1. Self-testing VLSI. The first goal of this component is 
to develop methodologies to design VLSI chips that are 
(1) thoroughly testable prior to normal operation, and 
(2) self-checking. A methodology for designing self-checking 
circuitry has been developed that allows the chip to detect 
internal faults concurrent with normal operation. We must learn 
to design a chip that will be fully exercised and tested during 
normal operation. Even if we can detect a fault when it occurs, 
it may be months or years before a complex chip in normal 
operation enters a faulty state. Thus, development of “easily” 
testable circuits is a high-priority research item. 
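
The self-checking methodology referred to above is not 
described in this report. As an illustration of the general idea 
of concurrent checking, the Python sketch below uses a mod-3 
residue code, one well-known technique for checking arithmetic 
units: a small independent check path predicts the residue of the 
result and compares it with the residue of the value the adder 
actually produced. The stuck-at fault in `faulty_adder` is 
hypothetical. 

```python
def residue3(x: int) -> int:
    """Residue of x modulo 3 (the check symbol)."""
    return x % 3


def checked_add(a: int, b: int, adder=lambda x, y: x + y):
    """Add with a concurrent mod-3 residue check.

    Returns (result, fault_detected).
    """
    result = adder(a, b)
    # Independent check path: works on residues, not on the full operands.
    expected_residue = (residue3(a) + residue3(b)) % 3
    return result, residue3(result) != expected_residue


def faulty_adder(x: int, y: int) -> int:
    """Hypothetical fault: bit 4 of the sum output is stuck at 1."""
    return (x + y) | 0b10000


print(checked_add(5, 6))                 # fault-free adder
print(checked_add(5, 6, faulty_adder))   # stuck bit corrupts the sum
print(checked_add(20, 1, faulty_adder))  # bit 4 is already 1 in the true sum
```

The third call shows the difficulty noted in the text: the fault 
is present but produces no error for these operands, so it 
remains undetected until normal operation happens to exercise it. 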

2. On-chip redundancy. A second goal of this research is to 
investigate the use of on-chip redundancy to improve yield and 
chip reliability. Although the existence of catastrophic failure 
modes makes it necessary to back up individual chips with 
spares, the use of on-chip redundancy may greatly improve 
chip reliability and system life. 
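
The potential benefit of on-chip spares can be estimated with a 
simple k-of-n model, sketched below in Python. It assumes that a 
chip contains n identical cells of which any k must work, that 
cells fail independently with the same survival probability, and 
that switching in a spare is perfect. The cell counts and the 
survival probability are hypothetical. 

```python
from math import comb


def k_of_n_reliability(n: int, k: int, p: float) -> float:
    """Probability that at least k of n independent cells survive.

    p is the survival probability of a single cell.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


cell_survival = 0.95  # hypothetical per-cell survival probability
needed = 8            # cells required for the chip to function

no_spares = k_of_n_reliability(needed, needed, cell_survival)
two_spares = k_of_n_reliability(needed + 2, needed, cell_survival)

print(f"no spares:  {no_spares:.4f}")
print(f"two spares: {two_spares:.4f}")
```

The model does not cover catastrophic failure modes that disable 
the whole chip, which is why chip-level spares remain necessary. 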



B. Architecture of Advanced ASM Systems 

This area includes the hardware and software organization 
to achieve fault tolerance in highly complex ASM spacecraft. 
This work must take into account the trend toward proliferation 
of computers in spacecraft subsystems, and it should be 
directed toward future space systems in which dozens of 
distributed computers may be used. It should address the impacts 
of VLSI in spacecraft architecture, performance, and ASM 
capability. It is expected to support ASM spacecraft 
development beyond 1990. 

To effectively involve the academic community in space- 
craft system research, it will probably be necessary for the 
USAF to develop a set of strawman system requirements. 
Most members of the academic community are not familiar 
with the unique problems of space systems (e.g., power, 
weight, volume, uplink and downlink, instruments, testability, 
command interfaces, and subsystem operation). Thus careful 
problem definition is required to focus this work toward real 
space problems. Such strawman systems might include a robot 
for in-space assembly, or a satellite that must correlate and 
make decisions on multiple sensor inputs. 

The following architectural tasks are highly interrelated 
(e.g., hardware and operating systems studies), and mechanisms 
for frequent interchange of information between groups working 
in this area are very important. A series of workshops might 
be one such mechanism. 

The following tasks have been identified: (1) organization 
studies, (2) operating systems for large hierarchic space 
systems, (3) recovery by problem solving, (4) fault tolerance 
in very-high-performance processors, and (5) architecture 
development. 

Underlying Tasks 1, 2, and 3 is the need to develop a 
hierarchic model of complex distributed functions in ASM 
spacecraft, and models of the interfaces between spacecraft 
subsystems. Computing in each spacecraft subsystem generates a 
virtual digital interface between the subsystem and the 
spacecraft system. Models of these interfaces should include 
generalized fault monitoring and recovery functions at each 
level of the hierarchy. Such models may lead to insights on 
how to structure these interfaces to improve software reliability 
and fault recovery, as well as simplified commanding and 
system integration. 

1. Organization studies. These studies will include postulating 
fault-tolerant, distributed, and hierarchical computer 
architectures along with communication formats and software 
executive structures that are applicable. The first goal of this 
component is to perform tradeoffs and pinpoint the relative 
capabilities and limitations of the postulated architectures 
with respect to spacecraft performance and fault tolerance. 
The second goal is to develop specific fault-tolerance techniques 
for use in these types of systems. Among the 
fault-tolerance questions to be addressed are: 

(1) How can reliable clocking and synchronization be 
carried out between the multiple processors? 

(2) How can embedded processors with their numerous 
input/output pins be spared? 

(3) How can nonhomogeneous specialized processors be 
handled, especially when fault-tolerant architectures 
are biased towards a homogeneous pool of processors? 

(4) How is executive software organized to support re- 
covery, rollback, and diagnosis? 

(5) Can the system be designed to tolerate software errors 
through fault-containment? 

(6) How does one design virtual interfaces that partition 
software between various computer modules? 

(7) How is redundancy distributed? What fault detection 
is provided at the various sensor/actuator levels within 
the computer, subsystem, and system levels? What 
are the levels of sparing employed on chips, between 
chips, and between subsystems? 

(8) How well can fault-tolerance features be made trans- 
parent to the user? 
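
Question (4), on organizing executive software to support 
recovery and rollback, can be illustrated with a minimal 
checkpoint-and-rollback sketch in Python. It is not a proposed 
executive design: the class, the state variable, and the 
acceptance check are hypothetical. The executive commits a new 
checkpoint only when a step leaves the state acceptable, and 
otherwise restores the last good checkpoint. 

```python
import copy


class RollbackExecutive:
    """Hypothetical executive that checkpoints state and rolls back on error."""

    def __init__(self, state: dict):
        self.state = state
        self._checkpoint = copy.deepcopy(state)
        self.rollbacks = 0

    def run_step(self, step, check) -> bool:
        """Run one step; commit if `check` accepts the new state, else roll back."""
        try:
            step(self.state)
            ok = check(self.state)
        except Exception:
            ok = False  # a crashing step is treated like a failed check
        if ok:
            self._checkpoint = copy.deepcopy(self.state)
        else:
            self.state = copy.deepcopy(self._checkpoint)
            self.rollbacks += 1
        return ok


def in_range(state: dict) -> bool:
    # Acceptance check: NaN fails both comparisons and is rejected.
    return 0.0 <= state["pointing_deg"] <= 360.0


executive = RollbackExecutive({"pointing_deg": 10.0})
executive.run_step(lambda s: s.update(pointing_deg=s["pointing_deg"] + 1.0), in_range)
executive.run_step(lambda s: s.update(pointing_deg=float("nan")), in_range)

print(executive.state, executive.rollbacks)
```

A rollback of this kind recovers from transient errors in a 
step; diagnosis of a persistent fault would require additional 
logic. 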

2. Operating systems for large hierarchic space systems. 
This research is directed at developing operating system con- 
cepts best suited to complex distributed systems. Issues to be 
addressed are: 

(1) Hierarchical partitioning of executive functions: 
global/local. 

(2) Effect of alternative executive structures on application 
software reliability, testability, and 
fault-containment. 

(3) Interaction of executive with hardware and software 
fault-tolerance mechanisms. 

(4) Provability of correctness of the executive. 

(5) Robustness: the ability of the operating system to 
survive errors in applications software. 

3. Recovery by problem solving. Many of the techniques of 
artificial intelligence and problem solving may be applicable in 
dealing with unanticipated fault conditions, or with operator 
errors. This task is intended to develop heuristic techniques to 
deal with this class of unexpected faults and possibly 
some errors. 

4. Fault-tolerant high-performance processors. This area 
includes the processors that will be or are being developed 
(e.g., signal processors). Techniques to achieve fault detection 
and recovery, and also to integrate such systems in ASM satel- 
lites, require investigation. This is especially true because many 
of these systems will probably not work without embedded 
fault tolerance due to a high transient error rate brought on by 
enormous complexity. 

5. Architecture development. To use ASM in a satellite, 
the supporting technology must be in place. Project offices 
are usually in no position to accept the delay and risk of 
developing new technology. Thus, this research program 
should develop one or more fault-tolerant computer system 
architectures to at least the breadboard stage. Fault-tolerant 
architectures are sufficiently complex that it is necessary to 
build and test them to understand their behavior. It is expected 
that the selection and design of these architectures would be 
outgrowths of current architecture developments (Software 
Implemented Fault Tolerance, Fault-Tolerant Multiprocessor, 
Fault-Tolerant Spaceborne Computer, and Building Block 
Fault-Tolerant Computer), which would be heavily influenced 
by the organization studies above. 


C. Software Fault Tolerance 

This research area is concerned with developing reliable 
software for distributed computer systems for ASM space- 
craft. It includes three areas of study: (1) system partitioning 
and interface definition to improve software reliability, 
(2) self-checking flight software, and (3) fault-tolerant 
software. 

1. Partitioning and interface definition. This task is tightly 
coupled with the architecture studies. The partitioning of 
functions within a distributed system and the virtual inter- 
faces between subsystems have a very large impact on the 
complexity and reliability of applications software. The goal 
of this research is to study tradeoffs between alternate parti- 
tioning and interface (command and data) definitions and their 
impacts on software complexity and reliability. (Such issues as 
the degree of system vs local control of a subsystem, timing 
requirements on commands, acceptable communications 
delays, scheduling strategy, and internal software structure, 
are involved in these studies.) 

2. Self-checking flight software. One goal of this task is to 
develop methodologies for detecting faults in applications 
software as it is performing its normal operations. This 
includes embedding acceptance tests in the flight programs 
and a variety of other software fault detection mechanisms. A 
second goal of this task is to develop verification and valida- 
tion techniques to prove the effectiveness of this self-checking 
code. 
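
As a minimal sketch of what an acceptance test embedded in a flight program might look like, the following checks a computed value against plausibility limits before it is used. The battery charge-current computation, the names Limits, accept, and charge_current, and the numerical limits are hypothetical illustrations and are not taken from the study.

```python
from dataclasses import dataclass


class AcceptanceTestFailure(Exception):
    """Raised when a computed value fails its run-time acceptance test."""


@dataclass(frozen=True)
class Limits:
    low: float       # smallest physically credible value
    high: float      # largest physically credible value
    max_step: float  # largest credible change between consecutive samples


def accept(value: float, previous: float, limits: Limits) -> float:
    """Return value unchanged if it is plausible, otherwise raise."""
    if value != value:  # NaN is the only float not equal to itself
        raise AcceptanceTestFailure("result is not a number")
    if not (limits.low <= value <= limits.high):
        raise AcceptanceTestFailure(f"{value} outside [{limits.low}, {limits.high}]")
    if abs(value - previous) > limits.max_step:
        raise AcceptanceTestFailure(f"step of {value - previous} exceeds {limits.max_step}")
    return value


def charge_current(bus_voltage: float, battery_voltage: float, resistance: float) -> float:
    """Hypothetical application computation whose output is being checked."""
    return (bus_voltage - battery_voltage) / resistance


# Usage: the application calls accept() on every result before acting on it.
limits = Limits(low=-5.0, high=5.0, max_step=1.0)
current = accept(charge_current(28.0, 27.0, 0.5), previous=1.5, limits=limits)
```

A failed test only signals that a fault has been detected; what the software does next (retry, switch to redundant code, or report to a higher level) is a separate recovery decision.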

3. Fault-tolerant software. This task is intended to 
develop techniques for producing software that continues to 
operate in the presence of programming errors. This is a 
difficult area that involves the use of software fault detection 
and the execution of redundant code to recover from design faults. 
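
One well-known way of combining software fault detection with redundant code is the recovery block; the text above does not name a specific scheme, so the following is only an illustrative sketch under that assumption. The function recovery_block and the two square-root versions are hypothetical names introduced here.

```python
import copy
import math
from typing import Callable, Sequence


class AllAlternatesFailed(Exception):
    """No redundant version produced an acceptable result."""


def recovery_block(state: dict,
                   versions: Sequence[Callable[[dict], float]],
                   acceptable: Callable[[float], bool]) -> float:
    """Run independently written versions in order until one passes the test."""
    checkpoint = copy.deepcopy(state)            # saved for rollback
    for version in versions:
        try:
            result = version(state)
            if acceptable(result):
                return result
        except Exception:
            pass                                 # a crash counts as a detected fault
        state.clear()
        state.update(copy.deepcopy(checkpoint))  # undo the failed version's side effects
    raise AllAlternatesFailed()


def primary_sqrt(state: dict) -> float:
    state["calls"] += 1
    return -(state["x"] ** 0.5)                  # deliberate design fault: wrong sign


def alternate_sqrt(state: dict) -> float:
    state["calls"] += 1
    return math.sqrt(state["x"])


# Usage: the primary fails the acceptance test, the state is rolled back,
# and the alternate supplies the result.
state = {"x": 9.0, "calls": 0}
root = recovery_block(state, [primary_sqrt, alternate_sqrt],
                      lambda r: r >= 0.0 and abs(r * r - 9.0) < 1e-9)
```

The scheme only helps if the alternate versions do not share the primary's design fault and if the acceptance test is strong enough to reject wrong answers, which is part of what makes this area difficult.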

D. Modeling and Analysis 

In the development of advanced ASM spacecraft systems, it 
is necessary to develop experimental testing techniques to 
verify the effectiveness of the built-in fault-tolerance mecha- 
nisms. Analytic statistical models that use these experimental 
results and component failure rates are then required to pre- 
dict the reliability and performability of the ASM spacecraft 
as a function of time. This type of modeling is essential to 
determine if a given spacecraft design will meet its objectives, 
or to perform tradeoffs between competing design approaches. 
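
To illustrate how an experimentally measured quantity and a component failure rate combine in such a model, the sketch below evaluates the textbook reliability of one active unit backed by one unpowered spare. The assumptions are labeled in the code and are ours, not the study's: a constant failure rate, a spare that cannot fail while unpowered, and a switchover that succeeds with a probability (the coverage) that would be obtained from testing. The failure rate and mission length are arbitrary illustrative values.

```python
import math


def standby_reliability(failure_rate: float, coverage: float, hours: float) -> float:
    """R(t) = exp(-lambda*t) * (1 + c*lambda*t) for one active unit and one cold spare.

    Assumptions: constant failure rate, a spare that cannot fail while
    unpowered, and a switchover that succeeds with probability `coverage`.
    """
    lt = failure_rate * hours
    return math.exp(-lt) * (1.0 + coverage * lt)


if __name__ == "__main__":
    rate = 1.0e-5          # failures per hour (illustrative value only)
    mission = 5 * 8760.0   # five years, in hours
    for c in (0.0, 0.90, 0.99, 1.0):
        print(f"coverage={c:.2f}  R={standby_reliability(rate, c, mission):.4f}")
```

With coverage set to zero the expression reduces to the reliability of a single unit, which shows why the spare is only as useful as the fault-tolerance mechanism that brings it into service.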

A second class of tools needed in ASM development are 
functional models and design languages that can facilitate 
design and verification of ASM systems. Such tools could 
provide the capability of specifying and simulating operations 
of proposed systems, and allow changes and improvements 
before a design is locked into hardware. A second important 
use of design languages and functional models is to provide a 
basis for formal verification of a design before launch, and 
validation of command sequences to an orbiting ASM 
spacecraft. 

The modeling and analysis area is broken into three compo- 
nents: (1) experimental testing, (2) statistical modeling, and 
(3) functional description, modeling, and verification. 

1. Experimental testing. Spacecraft testing, already a dif- 
ficult problem, will become considerably more complex with 
the introduction of ASM. A significant problem is how to 
test those functions that are dedicated to autonomous mainte- 
nance. The goal of this component is to acquire a deeper 
understanding of testing problems peculiar to ASM, and 
develop test generation and test application methods for 
solving these problems. Among the problems to be considered 
that complicate testing are: (1) testing at many levels in a 
hierarchy, (2) the possibility of many combinations of input 
events and many unanticipated faults, (3) the need to test in 
an artificial environment, (4) wear-out phenomena, and 
(5) the development of specific tests required by statistical 
reliability models. 
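
As a small illustration of item (5), a fault-injection campaign can measure the fraction of injected faults that a detection mechanism catches, which is the kind of coverage parameter a statistical reliability model requires. The sketch below is an assumption-laden toy: it uses a hypothetical 8-bit data word protected by a single even-parity bit and injects every possible one-bit and two-bit fault exhaustively.

```python
from itertools import combinations

CODEWORD_BITS = 9  # 8 data bits plus 1 parity bit (hypothetical word format)


def parity(word: int) -> int:
    """Return 1 if the word has an odd number of 1 bits, else 0."""
    return bin(word).count("1") % 2


def encode(data: int) -> int:
    """Append an even-parity bit to an 8-bit data word."""
    return (data << 1) | parity(data)


def detected(codeword: int) -> bool:
    """The checker flags any codeword whose overall parity is odd."""
    return parity(codeword) == 1


def injection_campaign(data: int, flips_per_fault: int) -> tuple:
    """Inject every fault that flips exactly `flips_per_fault` bits.

    Returns (faults detected, faults injected).
    """
    code = encode(data)
    injected = hits = 0
    for bits in combinations(range(CODEWORD_BITS), flips_per_fault):
        faulty = code
        for b in bits:
            faulty ^= 1 << b
        injected += 1
        hits += detected(faulty)
    return hits, injected


single = injection_campaign(0b10110010, 1)   # all single-bit faults
double = injection_campaign(0b10110010, 2)   # all double-bit faults
```

Parity catches every single-bit fault and no double-bit fault, so the measured coverage depends entirely on the fault classes injected; the same caution applies to any test campaign whose results feed a reliability model.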


2. Statistical modeling. Probabilistic models of ASM space- 
craft are needed to assess the probabilities of performing at 
various levels, ranging from full performance to failure, over 
the projected life of the spacecraft. Such models are being 
developed, but have yet to be extended to complex, 
heterogeneous spacecraft systems. The goal of this component is to 
extend current methods, developed primarily for computer 
applications, to accommodate additional complexities in 
large, heterogeneous spacecraft systems. A considerable 
advancement is required over existing models to deal with the 
complexity resulting from dependent subsystem failures, and 
the models must carefully relate to testing results as input 
parameters. Special emphasis must be placed on modeling and 
analysis of transient faults, since transients are expected to be 
a major problem of VLSI technology. 
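
The simplest possible instance of such a model is sketched below: the probability of operating at each performance level, where the level is taken to be the number of working units out of several identical ones. The assumptions are deliberately crude and are ours for illustration only: independent units, constant failure rates, no repair, and no transient faults. These are precisely the simplifications that the research described above would need to remove.

```python
import math
from typing import List


def level_probabilities(n_units: int, failure_rate: float, hours: float) -> List[float]:
    """Return P(exactly k units working) for k = 0..n_units at time `hours`.

    Assumes independent units with exponentially distributed lifetimes.
    """
    p = math.exp(-failure_rate * hours)  # survival probability of one unit
    return [math.comb(n_units, k) * p ** k * (1.0 - p) ** (n_units - k)
            for k in range(n_units + 1)]


def prob_at_least(level: int, probs: List[float]) -> float:
    """Probability of performing at or above the given level."""
    return sum(probs[level:])


# Usage with illustrative values: four units, five-year mission.
probs = level_probabilities(4, 1.0e-5, 5 * 8760.0)
full_performance = probs[4]
degraded_or_better = prob_at_least(2, probs)
```

Dependent subsystem failures break the binomial form used here, which is one reason existing methods require considerable extension.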

3. Functional description, modeling, and verification. 
Spacecraft systems are complex, multifunctional real-time sys- 
tems with many different types of physical subsystems. 
Although functional models may exist for many subsystems, 
functional descriptions at the spacecraft level are typically 
informal and incomplete. With the additional complexities of 
VLSI and autonomous maintenance, informal design methods, 
particularly at the system level, may no longer produce the 
desired results (witness the evolution of computer operating 
system design methods). 

One goal of this component is to investigate whether 
design languages, such as those being used in the context of 
computer and computer-based systems, can be usefully ex- 
tended to facilitate spacecraft design. Of particular relevance 
are languages that call for timeliness, fault tolerance, distri- 
buted resources, and concurrent (parallel) execution of tasks. 

A second related goal is to develop uniform functional 
models (abstract representations) of autonomously maintained 
spacecraft. The models sought are hierarchical models that 
relate high-level functional behavior of the total system to 
lower-level subsystem functions and interactions, both during 
normal operation and in various modes of fault recovery or of 
degraded operation. This type of model can facilitate the 
design process and may, in the future, lead to design automa- 
tion tools for spacecraft design. Functional models might also 
be used to formally verify the system. 
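
A hierarchical functional model of this kind can be sketched, under heavy simplification, as a tree in which the operating mode of each function is derived from the modes of the functions beneath it. The class names, the three-mode scale, the rule for combining modes, and the example subsystems below are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List


class Mode(IntEnum):
    FAILED = 0
    DEGRADED = 1
    NORMAL = 2


@dataclass
class Function:
    name: str
    children: List["Function"] = field(default_factory=list)
    required: int = 0              # number of children that must not be FAILED
    own_mode: Mode = Mode.NORMAL   # consulted only for leaf functions

    def mode(self) -> Mode:
        """Derive this function's mode from the modes of its children."""
        if not self.children:
            return self.own_mode
        modes = [child.mode() for child in self.children]
        alive = sum(m != Mode.FAILED for m in modes)
        if alive < self.required:
            return Mode.FAILED
        if all(m == Mode.NORMAL for m in modes):
            return Mode.NORMAL
        return Mode.DEGRADED


# Usage: a hypothetical spacecraft needing 3 of 4 wheels and 1 of 2 batteries.
wheels = [Function(f"wheel {i}") for i in range(4)]
attitude = Function("attitude control", wheels, required=3)
power = Function("power", [Function("battery A"), Function("battery B")], required=1)
spacecraft = Function("spacecraft", [attitude, power], required=2)

wheels[0].own_mode = Mode.FAILED   # one wheel lost: system-level mode becomes DEGRADED
system_mode = spacecraft.mode()
```

Even this toy model relates system-level behavior to subsystem states in both normal and degraded operation; a usable model would also have to represent timing, recovery sequences, and interactions between subsystems.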

With the increase in logical complexity required for advanced 
ASM spacecraft, model-based evaluation and testing may not 
suffice to provide the desired confidence in the system. The 
third goal is to investigate the possibility of extending formal 
verification methods (such as those being developed for pro- 
grams, operating systems, and at least one avionics processor) 
so as to apply to formal descriptions of spacecraft. 
The areas of specification language, formal functional 
models, and formal verification techniques are intimately 
related, and are thus grouped into one component in this 
plan. This represents long-term, high-risk research, but the 
payoff can be enormous. 


Two supporting developments have been suggested by the 
research group. The first, of immediate urgency, is ASM data 
base development. The second is development of a spacecraft 
laboratory for ASM integration patterned after a similar 
development within NASA. 


1. Data Base Development. A comprehensive data base 
should be established for ASM development. It should serve 
as a repository of two types of data. 


The first is statistical data required to determine values of 
parameters for reliability and performance models. Currently 
used data is often incomplete and inaccurate. Data sought 
should include piecepart failure data (particularly transient 
failures), data on VLSI failure mechanisms, data on subsystem 
and system failures, and environmental data gathered from 
past missions. 


The second type of data is information on existing (perhaps 
generic) spacecraft systems and subsystems, and information 
on redundancy and ASM techniques already being used on 
spacecraft. There is a significant problem of technology trans- 
fer between spacecraft designers and researchers. This type of 
data base would provide a multidiscipline exchange that may 
be indispensable in advancing the state of the art in ASM. 


2. Spacecraft Laboratory. This is a much more ambitious 
development. It would consist of a computing facility for ASM 
spacecraft integration (analogous to NASA’s Airlab for avion- 
ics systems). This is envisioned as a facility where spacecraft 
simulations would be provided. New hardware/software sub- 
systems could be integrated and tested using the system, and 
new system designs could be developed and simulated. Such a 
facility would be used for experimental testing of prototype 
spacecraft systems. It would be national in scope and provide 
both access and a focus for information exchange between 
manufacturers and researchers. 





Conclusions and Recommendation 


ASM is a logical, evolutionary change to the Air Force's 
concept of space system operations that results in the transfer 
of functions from the ground segment to the space segment. 

With ASM, the role of the ground segment becomes one of 
supervisory control and operations management, rather than 
detailed control of operations. Experience has shown that, 
generally, spacecraft have enough spares to meet mission life- 
time requirements, so additional redundant parts are not 
necessarily needed for ASM. 

In the opinion of the study participants, the present system 
and current spacecraft operate quite well in that mission objec- 
tives and user data needs are satisfied. The present space seg- 
ment, however, was designed to operate with man and his 
ground control function as an integral element. Successful 
space segment operation currently requires closure through the 
ground segment both before and after the occurrence of faults. 
ASM will remove this requirement from the day-to-day activ- 
ities of space segment operations. 


I. Conclusions 

The following conclusions are those of the study group 
participants and result from analysis of the material developed 
during this study. 


1. ASM would reduce the vulnerability problem. There is a 
need to decrease space segment dependence on the ground 
segment because the ground segment is vulnerable to both 
hostile action and operator error. By eliminating dependence 
on the ground segment for fault detection, isolation and 
recovery management, and for routine operations functions 
such as power management and ephemeris updating, space 
system vulnerability will be significantly reduced. 

2. The ASM capability need not impose operational con- 
straints on the system user. If anything, the user should per- 
ceive a more responsive spacecraft with ASM present. New 
procedures for user system operations and data retrieval 
should not be required. Data outage resulting from most 
internal faults would be reduced from hours to seconds, 
making the ASM capability virtually transparent to the space 
segment data user. 

3. ASM would require a change in the conduct of opera- 
tions and control. The role of the ground segment in system 
operations must be redefined. Detailed control of routine 
operations and maintenance functions would be assumed by 
the space segment, with supervisory ground control. Supervi- 
sory control would be maintained by an audit trail capability 
that would provide nonreal time (up to 6 months) visibility 
into maintenance actions, and by the capability for ground 
segment override of space segment autonomous actions. 

4. ASM would add complexity to the spacecraft design; 
therefore, new methods for specifying, testing, and validating 
ASM-augmented spacecraft are needed. Concepts for specify- 
ing, testing, and validating ground-based, fault-tolerant pro- 
cessing systems have recently been developed. Interaction 
between computer and spacecraft technologists during this 
study has shown these concepts to be applicable to ASM. New 
methodologies for design and analysis are required to address 
such issues as fault coverage and recovery latency, measures of 
effectiveness, risk assessments, and proof-of-correctness. 

5. A more effective means of transferring technology from 
research to applications programs would be required. The ASM 
study has served as a forum for the exchange of technology 
between researchers and application specialists. A continuation 
of information exchanges between these two communities will 
increase the level of awareness of both the technological prob- 
lems and their potential solutions. As noted above in item 4, 
the collective experience of fault-tolerant data processing sys- 
tem specialists can serve as a surrogate for guiding the evolu- 
tion of new spacecraft methods and new technologies required 
to satisfy space segment environmental constraints. 

6. New technology developments would be required. Two 
specific technological developments were identified: 

(1) A highly reliable fault-tolerant computing capability 
with nonvolatile back-up memory to enable autono- 
mous maintenance. 

(2) An autonomous navigation capability to enable inde- 
pendence of routine ground operations. 

The fault-tolerant computing system is expected to have 
complete authority over spacecraft resources employed during 
reconfiguration by using hierarchical recovery management 
algorithms, diagnostic test procedures, fault-trail reporting 
mechanisms, and normal spacecraft operations. This authority 
must manage contention for system resources and manage 
subsystem interdependencies arising during anomalous opera- 
tions. The conceptual design requirements for 60-day/6-month 
autonomy necessitate moving the navigation function from 
the ground to the spacecraft. 

7. A strong corporate commitment to ASM by the Air 
Force would be required to make ASM successful. The imple- 
mentation of ASM would be a phased program, with the 
spacecraft fleet evolving from non-ASM to ASM spacecraft 
over a period of several years. The spacecraft would not 
instantly become totally autonomous. The pace of ASM 
development and implementation would depend upon the 
resources, technology, and chosen program applications that 
are provided. To plan the implementation of ASM and coordi- 
nate the actions of the System Program Offices and the 
ground segment, a strong, long-term corporate commitment 
would be needed. This would ensure successful integration of 
ASM into the Air Force’s space system. 

8. Confidence in ASM must be instilled by creation of a 
systematic modeling, analysis, and demonstration program. 
Total confidence in ASM will result only after operations are 
proven to be predictable and understandable. However, 
proof-of-concept demonstrations of such individual ASM 
capabilities as battery reconditioning, autonomous recovery 
from bus undervoltage conditions, and autonomous computer 
self-diagnoses will help provide early confidence in ASM. 
Confidence will be further established during the transition 
phase when quantitative figures of merit for ASM and non- 
ASM strategies can be developed within the flight 
environment. 

9. ASM is a viable concept. ASM is the technological 
infusion of ground-based functions into long-lived, highly- 
reliable spacecraft. These functions are well understood and 
operating successfully on the ground now. Precepts borrowed 
from fault-tolerant computing will provide guidance for eval- 
uation of fault-detection, isolation, and recovery techniques 
appropriate to the space environment. Other studies are in 
progress that will provide insight into the solution of the 
autonomous navigation problem, and no other technology 
gaps have been identified. Thus, ASM is workable and, given 
the urgency of the present situation, it should be started now. 


II. Recommendation 

The study group recognizes the need for ASM, and has 
found the technology available in today's spacecraft systems 
to be a good foundation from which to proceed to ASM. 
The plan presented is practicable; it aims at a series of prudent, 
gradually expanding (from subsystem to system level) capabil- 
ity demonstrations. The study group therefore recommends 
that the Air Force proceed with the technology development 
and research programs as outlined in the Implementation 
Plan and Research Agenda. These programs would provide 
the earliest possible demonstration of ASM as a valid system- 
level capability, and lay the research-oriented groundwork for 
the "second generation" ASM of the 1990s. 